Automatic import of classifications

Hi,
thanks a lot for this great piece of software and all the efforts you put into it!

Are there any plans to support dynamic import/fetching of classifications?
I’d like to add all of an ETF’s holdings to the ETF via classifications. Doing this manually is not an option (e.g. MSCI World contains around 1500 holdings).
So far I haven’t found any free or cheap API that provides this information in a uniform way, but the issuers provide it for free.
However, the issuers all use different formats (e.g. JSON, CSV, XLS, HTML table), so I think it’s not viable to implement this directly in PP.
Nevertheless, there are already a lot of utilities in place to parse the different formats (JSON, CSV and HTML table), but they don’t work for importing classifications.
Is there a technical reason why this isn’t supported, or was it just not in scope yet?

I know that there are some further obstacles to overcome to get this working:

  1. Different exchanges use different tickers (e.g. ‘Electronic Arts Inc’ uses the ticker ‘EA’ on ‘NASDAQ’, but ‘ERT’ on ‘STU’ [the German exchange in Stuttgart]). I live in Germany, so I’d probably assign the ‘ERT’ ticker in PP. If one of my ETFs also held this stock, it would probably be listed under the ticker ‘EA’ in the issuer’s holdings list, and the two couldn’t be matched to calculate the overall portfolio exposure to this stock. To overcome this, it’s necessary either to maintain a list of multiple tickers per stock or to use a different identifier (e.g. ISIN), but that information isn’t provided in every issuer’s holdings list and would need to be gathered elsewhere.
  2. Most issuers use custom IDs in their URLs. It would therefore be necessary to use the issuer’s search function to get the internal ID (e.g. by ticker or ISIN). I think something like this currently isn’t possible in PP.
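
To illustrate the first obstacle, here is a minimal sketch of matching a holdings entry to a tracked security: the ISIN is preferred, and a per-security set of ticker aliases is the fallback. The `Security` class and the holding dictionaries are hypothetical and not PP's data model; the ISIN is included for illustration and should be verified before use.

```python
from dataclasses import dataclass, field


@dataclass
class Security:
    """A tracked security (hypothetical structure, not PP's model)."""
    name: str
    isin: str
    tickers: set[str] = field(default_factory=set)  # all known exchange tickers


def build_index(securities):
    """Map every ISIN and every known ticker to its security."""
    by_isin = {s.isin: s for s in securities if s.isin}
    by_ticker = {t.upper(): s for s in securities for t in s.tickers}
    return by_isin, by_ticker


def match_holding(holding, by_isin, by_ticker):
    """Prefer the ISIN; fall back to the ticker only if no ISIN is given."""
    isin = holding.get("isin")
    if isin:
        return by_isin.get(isin)
    ticker = holding.get("ticker")
    return by_ticker.get(ticker.upper()) if ticker else None


ea = Security("Electronic Arts Inc", "US2855121099", {"EA", "ERT"})
by_isin, by_ticker = build_index([ea])

# A holdings list that only carries the NASDAQ ticker still matches
# the security that is tracked under the Stuttgart ticker.
matched = match_holding({"ticker": "EA"}, by_isin, by_ticker)
print(matched.name if matched else "no match")
```

The alias set still has to be maintained by hand or gathered from another source, which is exactly the open problem described above.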

How can this be integrated in PP?
As I have some experience in web scraping, I’d love to have a feature that enables the creation of custom scrapers.
Something that enables the following:

  • Perform multiple fetches, then parse and store the relevant data in variables (e.g. use the issuer’s search to get the internal ID by ticker/ISIN).
  • Use these variables in subsequent fetches (e.g. fetch holdings by the issuer’s internal ID), parse/map the data (e.g. JSON, CSV or HTML table), convert it to a classification and store it in the security details.
    I don’t know where this could be integrated in PP, but one way would be to introduce an additional tab in the security details (maybe similar to the ‘Current prices’ tab), or to integrate it into the ‘Classifications’ tab by defining scraping profiles elsewhere and selecting them as the data source for the classification.
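
A rough sketch of what such a scraping profile could look like: a list of steps, where each step fills a URL template from earlier variables, parses the response and stores the result under a new variable name. Everything here is hypothetical (the `issuer.example` URLs, the JSON and CSV field names, the profile format); the fetcher is injectable so the example runs offline against made-up canned responses.

```python
import csv
import io
import json
from urllib.request import urlopen


def http_get(url):
    """Default fetcher; swap in a stub for offline testing."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def run_profile(steps, variables, fetch=http_get):
    """Run the steps in order; each one stores its parsed result in a variable."""
    variables = dict(variables)
    for step in steps:
        url = step["url"].format(**variables)  # earlier results fill the template
        variables[step["store_as"]] = step["parse"](fetch(url))
    return variables


# Hypothetical issuer endpoints: a JSON search, then a CSV holdings download.
PROFILE = [
    {
        "url": "https://issuer.example/search?isin={isin}",
        "parse": lambda body: json.loads(body)["results"][0]["id"],
        "store_as": "fund_id",
    },
    {
        "url": "https://issuer.example/funds/{fund_id}/holdings.csv",
        "parse": lambda body: [
            {"name": row["Name"], "weight": float(row["Weight"])}
            for row in csv.DictReader(io.StringIO(body))
        ],
        "store_as": "holdings",
    },
]


def fake_fetch(url):
    """Canned, made-up responses standing in for the issuer's website."""
    if "search" in url:
        return '{"results": [{"id": "12345"}]}'
    return "Name,Weight\nCompany A,60.0\nCompany B,40.0\n"


result = run_profile(PROFILE, {"isin": "IE00BJ0KDQ92"}, fetch=fake_fetch)
print(result["fund_id"], result["holdings"])
```

Keeping the parsers as plain callables means a new issuer format only needs a new profile, not a change to the runner.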

Edit:
I was also thinking about using Google Sheets as a workaround for the fetching part, providing the holdings data in a uniform way in that spreadsheet, but I think there currently isn’t a way to import classifications from a spreadsheet either.

…of classifications AND regions, please! :smiley:

Would you be so kind as to add a step-by-step tutorial, e.g. for the ISIN IE00BJ0KDQ92?
Thank you so much,
Franz

Based on the work of @f_bu and @traits, I have modified the script to use Morningstar information. It is therefore applicable to any fund or ETF with an ISIN (not only to iShares ETFs), but it does not give the currency information. Nevertheless, it might be useful as it is.

https://github.com/fizban99/pp-portfolio-classifier

Hi guys,

this tool sounds great! I was looking for something like that for a long time.
Unfortunately, I have no knowledge of Python 3 whatsoever. Do you think one might be able to get the script running without any experience with Python 3?
I do not have any programming skills, but I can work my way into something.

I would appreciate a quick assessment. Thanks!
Cheers, Loic

I would say that no programming skills are required to run the script. I would recommend trying it and coming back with the problems you encounter, so that the instructions can be improved for others in a similar situation.

Sounds great!
What happens when an asset already has asset allocations assigned? Are existing asset allocations updated?
Are asset allocation and country also set for single company shares? That would be great.
Thanks a lot

As with @f_bu's original script, it creates a copy of your master file and adds the corresponding taxonomies. It does not update existing taxonomies. The resulting file is not meant to be the master file. You should keep using your original file and generate the new one whenever you want to see the classification. This should be seen as a workaround. The real solution would be to have some kind of integration with PP.
As for stocks, unfortunately it only works with funds and ETFs; I am afraid it will currently raise an error with single company shares.

Thanks a lot @fizban. Even when it’s not an overall solution with update capabilities, it sounds like a good workaround. Thank you for your explanation.

The script works very well and without restrictions for ETFs, but if there is at least one stock in your portfolio, running the script returns the following error message:

Traceback (most recent call last):
  File "/Users/***/portfolio-classifier.py", line 659, in <module>
    pp_file.add_taxonomy(taxonomy)
  File "/Users/***/portfolio-classifier.py", line 573, in add_taxonomy
    security_h = security.load_holdings()
  File "/Users/***/portfolio-classifier.py", line 345, in load_holdings
    self.holdings.load(isin = self.ISIN, secid = self.secid)
  File "/Users/***/portfolio-classifier.py", line 432, in load
    percentages = [float(value[key][percent_field]) for key in keys]
  File "/Users/***/portfolio-classifier.py", line 432, in <listcomp>
    percentages = [float(value[key][percent_field]) for key in keys]
TypeError: float() argument must be a string or a real number, not 'NoneType'

Is there already a workaround available for portfolios which also contain stocks?

I believe there were some modifications in my local version to address it. I uploaded it to GitHub. Can you try it?

Thanks for the quick reply. I tried out the new version. Now the following error appears:

Traceback (most recent call last):
  File "/Users/***/portfolio-classifier.py", line 670, in <module>
    pp_file.add_taxonomy(taxonomy)
  File "/Users/***/portfolio-classifier.py", line 584, in add_taxonomy
    security_h = security.load_holdings()
  File "/Users/***/portfolio-classifier.py", line 345, in load_holdings
    self.holdings.load(isin = self.ISIN, secid = self.secid)
  File "/Users/***/portfolio-classifier.py", line 457, in load
    categories = [taxonomy['map'][key] for key in keys]
  File "/Users/***/portfolio-classifier.py", line 457, in <listcomp>
    categories = [taxonomy['map'][key] for key in keys]
KeyError: 'CANAssetAllocFixedIncome'
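
Both tracebacks above come from list comprehensions that assume every key has a numeric value and a mapped category. Below is a minimal sketch of a more tolerant version; it is not the script's actual code. The function name, the `netAssets` field and the sample keys are hypothetical, and only `CANAssetAllocFixedIncome` is taken from the error message.

```python
import logging

logger = logging.getLogger("classifier-sketch")


def safe_weights(values, keys, percent_field, category_map):
    """Return {category: percent}, skipping missing values and unmapped keys."""
    weights = {}
    for key in keys:
        raw = (values.get(key) or {}).get(percent_field)
        if raw is None:
            # Avoids: TypeError: float() argument must be ... not 'NoneType'
            continue
        category = category_map.get(key)
        if category is None:
            # Avoids: KeyError for keys the taxonomy map does not know
            logger.warning("No category mapped for %r; skipping", key)
            continue
        weights[category] = weights.get(category, 0.0) + float(raw)
    return weights


# Made-up sample data in a hypothetical shape.
values = {
    "AssetAllocEquity": {"netAssets": "97.5"},
    "AssetAllocCash": {"netAssets": None},
    "CANAssetAllocFixedIncome": {"netAssets": "2.5"},
}
category_map = {"AssetAllocEquity": "Stocks", "AssetAllocCash": "Cash"}

result = safe_weights(values, list(values), "netAssets", category_map)
print(result)
```

Skipping silently hides data problems, so logging each skipped key (as done for unmapped categories) is probably the safer default.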

Hello. I tried this at home, and I’m an absolute programming beginner. I’ve installed Git for Windows and Python 3, and I’ve downloaded the program as a zip file. Inside the extracted folder I right-clicked and started “git bash here” from the context menu. In the next step I tried the command:

pip3 -r requirements.txt

The program answered:
bash: pip3: command not found

What did I do wrong?
Greetings

@PhilJosh Can you give an example of an ISIN of a stock that fails? Anyway, I uploaded a new version that should handle the errors better. Besides, if you now edit the isin2secid.json file with a text editor, you can change the mapping for the ISIN of a stock to "" and it will be skipped. So if a stock has the ISIN “DE0007236101”, the corresponding line would read:

 "DE0007236101": "",
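
If there are many stocks to skip, the same edit can be scripted. This is a small sketch that assumes isin2secid.json is a flat JSON object mapping each ISIN to a secid string, as the line above suggests; the helper name and the `example-secid` value are made up, and the demo works on a temporary file rather than the real one.

```python
import json
import tempfile
from pathlib import Path


def skip_isins(mapping_path, isins):
    """Set the secid of the given ISINs to "" so the script skips them."""
    path = Path(mapping_path)
    mapping = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    for isin in isins:
        mapping[isin] = ""
    path.write_text(json.dumps(mapping, indent=1), encoding="utf-8")
    return mapping


# Demo on a throwaway file so a real isin2secid.json stays untouched.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "isin2secid.json"
    demo_file.write_text('{"IE00BJ0KDQ92": "example-secid"}', encoding="utf-8")
    updated = skip_isins(demo_file, ["DE0007236101"])
    on_disk = json.loads(demo_file.read_text(encoding="utf-8"))
    print(updated)
```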

@Koksman Thanks for trying the script. Although it should not be too difficult to make it work, it is hard to assess your problem without knowing how you installed Python 3 on Windows. Anyway, here are a few tips:
1.- I will assume that you installed Python with the official Windows installer for Python 3.10.6 from Python.org.
2.- I will also assume that you allowed the default installation of the py launcher but did not check the option to modify the PATH. This prevents you from running python or pip directly from the command line, but it is not much of a problem, since you can still use the py launcher.
3.- Git is not really required here, since you downloaded the program as a zip file. Nevertheless, since you have it, you can use the “git bash here” option to get a shell directly in that folder. Alternatively, you can SHIFT+right-click on the folder and open a PowerShell window from the context menu. Although they are different shells, both allow running the py launcher.
4.- Download the program as a zip file again, since requirements.txt has been updated, unzip it as you did before and open a shell in that folder (git bash, command line or PowerShell).
5.- In whatever shell you use, type:

py -m pip install -r requirements.txt

6.- Copy your portfolio.xml into the funds-classifier folder. That way you make sure you do not modify the original.
7.- In the shell, type:

py portfolio-classifier.py <input_file>

where <input_file> is the XML file you just copied, including its .xml extension. The script should generate a pp_classified.xml file with the autoclassification in the same folder.

Let me know how it goes…

I’ve worked through your instructions. The command for the requirements.txt showed me some error lines:

$ py -m pip install -r requirements.txt
Collecting Jinja2==2.11.2
Downloading Jinja2-2.11.2-py2.py3-none-any.whl (125 kB)
-------------------------------------- 125.8/125.8 kB 2.5 MB/s eta 0:00:00
Collecting requests==2.24.0
Downloading requests-2.24.0-py2.py3-none-any.whl (61 kB)
---------------------------------------- 61.8/61.8 kB 3.2 MB/s eta 0:00:00
Collecting requests-cache==0.5.2
Downloading requests_cache-0.5.2-py2.py3-none-any.whl (22 kB)
Collecting jsonpath_ng==1.5.3
Downloading jsonpath_ng-1.5.3-py3-none-any.whl (29 kB)
Collecting markupsafe==1.1.1
Downloading MarkupSafe-1.1.1.tar.gz (19 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status ‘done’
Collecting beautifulsoup4==4.9.3
Downloading beautifulsoup4-4.9.3-py3-none-any.whl (115 kB)
-------------------------------------- 115.8/115.8 kB 7.0 MB/s eta 0:00:00
Collecting idna<3,>=2.5
Downloading idna-2.10-py2.py3-none-any.whl (58 kB)
---------------------------------------- 58.8/58.8 kB 3.0 MB/s eta 0:00:00
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1
Downloading urllib3-1.25.11-py2.py3-none-any.whl (127 kB)
-------------------------------------- 128.0/128.0 kB 3.8 MB/s eta 0:00:00
Collecting certifi>=2017.4.17
Downloading certifi-2022.6.15-py3-none-any.whl (160 kB)
-------------------------------------- 160.2/160.2 kB 4.8 MB/s eta 0:00:00
Collecting chardet<4,>=3.0.2
Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
-------------------------------------- 133.4/133.4 kB 4.0 MB/s eta 0:00:00
Collecting decorator
Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting ply
Downloading ply-3.11-py2.py3-none-any.whl (49 kB)
---------------------------------------- 49.6/49.6 kB 2.6 MB/s eta 0:00:00
Collecting six
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting soupsieve>1.2
Downloading soupsieve-2.3.2.post1-py3-none-any.whl (37 kB)
Using legacy ‘setup.py install’ for markupsafe, since package ‘wheel’ is not installed.
Installing collected packages: ply, chardet, urllib3, soupsieve, six, markupsafe, idna, decorator, certifi, requests, jsonpath_ng, Jinja2, beautifulsoup4, requests-cache
WARNING: The script chardetect.exe is installed in ‘C:\Users\simon\AppData\Local\Programs\Python\Python310\Scripts’ which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Running setup.py install for markupsafe: started
Running setup.py install for markupsafe: finished with status ‘done’
WARNING: The script jsonpath_ng.exe is installed in ‘C:\Users\simon\AppData\Local\Programs\Python\Python310\Scripts’ which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed Jinja2-2.11.2 beautifulsoup4-4.9.3 certifi-2022.6.15 chardet-3.0.4 decorator-5.1.1 idna-2.10 jsonpath_ng-1.5.3 markupsafe-1.1.1 ply-3.11 requests-2.24.0 requests-cache-0.5.2 six-1.16.0 soupsieve-2.3.2.post1 urllib3-1.25.11

[notice] A new release of pip available: 22.2.1 → 22.2.2
[notice] To update, run: C:\Users\simon\AppData\Local\Programs\Python\Python310\python.exe -m pip install --upgrade pip

After that I made a copy of the portfolio.xml in the pp-portfolio-classifier-main folder. The next command, with the input file, gave another error. I’ve checked the name of the file:

$ py portfolio-classifier.py Depot_Classifier
Traceback (most recent call last):
  File "C:\Users\simon\HiDrive\Dokumente\Bank\pp-portfolio-classifier-main\portfolio-classifier.py", line 682, in <module>
    pp_file = PortfolioPerformanceFile(args.input_file)
  File "C:\Users\simon\HiDrive\Dokumente\Bank\pp-portfolio-classifier-main\portfolio-classifier.py", line 523, in __init__
    self.pp_tree = ET.parse(filepath)
  File "C:\Users\simon\AppData\Local\Programs\Python\Python310\lib\xml\etree\ElementTree.py", line 1222, in parse
    tree.parse(source, parser)
  File "C:\Users\simon\AppData\Local\Programs\Python\Python310\lib\xml\etree\ElementTree.py", line 569, in parse
    source = open(source, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'Depot_Classifier'

greetings

I’ve found the error. I repeated the command with the .xml ending after the input file name. Now it worked as it should. But please let me know whether I should care about the errors after the command with the requirements.txt.

Thank you for your help and the great program.

greetings
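
The FileNotFoundError above came from passing the file name without its .xml extension. A small pre-flight check along the following lines could catch that before the XML parser is called. This is only a sketch; `resolve_input` is a hypothetical helper and not part of the script.

```python
import tempfile
from pathlib import Path


def resolve_input(name, folder="."):
    """Return the existing portfolio file, trying a '.xml' suffix if needed."""
    candidate = Path(folder) / name
    if candidate.is_file():
        return candidate
    with_suffix = candidate.with_name(candidate.name + ".xml")
    if with_suffix.is_file():
        return with_suffix
    raise FileNotFoundError(f"Neither {candidate} nor {with_suffix} exists")


# Demo in a temporary folder with a stand-in portfolio file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Depot_Classifier.xml").write_text("<client/>", encoding="utf-8")
    found = resolve_input("Depot_Classifier", folder=tmp)
    print(found.name)
```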

@fizban Thanks for your feedback. Now everything is working well.
Only one problem persists: when the ISIN field of a security is empty, the script returns an error as well.

@fizban And one more thing: the script works well even if single securities/stocks are included in the portfolio. Nevertheless, it does not take the classifications already assigned to these stocks into account in the newly calculated assignments in the “pp_classified.xml” file.

Is there an option to make this possible?

So let’s say I have already assigned “Country-->UnitedStates” to a single security. Then it should be possible for this assignment to be carried over into the newly calculated “Country” classification as “Country-->UnitedStates”.

Sorry for the late reply. Just to save me some time looking for a security without an ISIN, can you give an example?
Regarding pre-existing categories, I guess one possibility would be to allow some kind of mapping between existing categories and the new ones. Maybe through an additional configuration file… I’ll have to think about that…
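
One way such a mapping could look, purely as a sketch: a configuration that translates a security's existing category into a category of the generated taxonomy, which then receives the full 100 % weight. The configuration format, the category names and the `carry_over` helper are all hypothetical; nothing like this exists in the script yet.

```python
import json

# Hypothetical configuration: per taxonomy, existing category -> new category.
CONFIG_TEXT = '{"Country": {"UnitedStates": "United States"}}'


def carry_over(existing, config):
    """Translate a security's existing assignments into 100 % weights."""
    result = {}
    for taxonomy, category in existing.items():
        target = config.get(taxonomy, {}).get(category)
        if target is not None:
            result.setdefault(taxonomy, {})[target] = 100.0
    return result


config = json.loads(CONFIG_TEXT)
carried = carry_over({"Country": "UnitedStates", "Industry": "Gaming"}, config)
print(carried)
```

Assignments without an entry in the configuration are simply left out here; whether they should be dropped or reported is a design choice still to be made.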

I hold shares in a bank that is not publicly listed, so it’s a security I created myself. You could try creating an arbitrary security yourself; you should then face the same problem.
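
A possible guard for this case, as a sketch only: skip every security whose ISIN is empty or missing before any lookup is attempted. The sample XML assumes that each `<security>` element has `<name>` and `<isin>` children; that layout and the helper name are assumptions for illustration and should be checked against a real portfolio file.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed layout.
SAMPLE = """<client><securities>
  <security><name>Some ETF</name><isin>IE00BJ0KDQ92</isin></security>
  <security><name>Unlisted bank shares</name><isin></isin></security>
  <security><name>No ISIN element</name></security>
</securities></client>"""


def securities_with_isin(root):
    """Yield (name, isin) only for securities that have a non-empty ISIN."""
    for security in root.iter("security"):
        isin = (security.findtext("isin") or "").strip()
        if not isin:
            continue  # self-created or unlisted security: nothing to look up
        yield security.findtext("name"), isin


root = ET.fromstring(SAMPLE)
usable = list(securities_with_isin(root))
print(usable)
```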