Actually, the page you mention is the one that uses the Morningstar API behind the scenes through AJAX. If we check, for example, Vanguard S&P 500 Index ETF (CAD-hedged), which is one that fails through Morningstar's EMEA API, we see that it actually uses Morningstar's US API. So I guess the best solution would be to check first in EMEA, then in US, and use the x-ray as a final fallback, although with both APIs the x-ray fallback might not be necessary.
Another thing I see now is that, while for EMEA it did not seem to matter whether you use the ETF endpoint or the fund endpoint of the API, for the US you have to use the ETF endpoint for ETFs and the fund endpoint for funds, so we first have to identify the type of security…
Sorry, my bad. It seems that in EMEA the ETF vs. fund endpoint also matters (at least sometimes). I will have to review the code and see how to handle it. That might actually be the reason for most of the failures…
uff, that sounds complicated.
Perhaps, instead of guessing the endpoint, just try one endpoint and, if it does not deliver data, try the next one. Once an endpoint delivers data, move on to the next security. And log a message for each endpoint tried.
Perhaps with a small delay of 200 ms between calls, so that we never risk being flagged as request spam (and perhaps IP-blocked) by any Morningstar security rules. I think we can wait a bit for the data; the most important thing is that it is complete at the end.
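The "try one endpoint, then the next, with a log message and a short pause" idea could look roughly like the sketch below. It is not taken from the script: `fetch_with_fallback` and the `Fetcher` type are hypothetical names, and each fetcher is assumed to return `None` (or an empty result) when its endpoint has no data for the security.

```python
import logging
import time
from typing import Callable, Optional, Sequence, Tuple

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("classifier")

# Hypothetical type: a fetcher takes a security id and returns holdings data,
# or None when that endpoint has nothing for the security.
Fetcher = Callable[[str], Optional[dict]]


def fetch_with_fallback(
    secid: str,
    endpoints: Sequence[Tuple[str, Fetcher]],
    delay: float = 0.2,
) -> Optional[dict]:
    """Try each endpoint in order and return the first non-empty result."""
    for position, (name, fetch) in enumerate(endpoints):
        if position > 0:
            time.sleep(delay)  # pause between consecutive calls
        try:
            data = fetch(secid)
        except Exception as exc:  # treat any failure as "no data here"
            log.info("%s: endpoint %s failed (%s)", secid, name, exc)
            continue
        if data:
            log.info("%s: data found via %s", secid, name)
            return data
        log.info("%s: no data from %s", secid, name)
    return None  # every endpoint came back empty
```

The pause here only applies between endpoints of the same security; a caller that loops over many securities would presumably add the same short sleep between securities as well.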
Thanks a lot for your work
Hello,
I tried to use the script, but get the following errors:
Traceback (most recent call last):
File "xxxxxxx\pp-portfolio-classifier\portfolio-classifier.py", line 703, in <module>
pp_file.add_taxonomy(taxonomy)
File "xxxxxxx\pp-portfolio-classifier\portfolio-classifier.py", line 561, in add_taxonomy
securities = self.get_securities()
^^^^^^^^^^^^^^^^^^^^^
File "xxxxxxx\pp-portfolio-classifier\portfolio-classifier.py", line 663, in get_securities
security_h = security.load_holdings()
^^^^^^^^^^^^^^^^^^^^^^^^
File "xxxxxxx\pp-portfolio-classifier\portfolio-classifier.py", line 348, in load_holdings
self.holdings.load(isin = self.ISIN, secid = self.secid)
File "xxxxxxx\pp-portfolio-classifier\portfolio-classifier.py", line 462, in load
percentages = [float(key.value[taxonomy['percent']]) for key in value]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "xxxxxxx\pp-portfolio-classifier\portfolio-classifier.py", line 462, in <listcomp>
percentages = [float(key.value[taxonomy['percent']]) for key in value]
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
KeyError: 'percent'
The test was working, but when I run the script against my portfolio I get these errors and no files are written. Also, isin2secid.json is not updated.
Any ideas?
Thank you for trying the script, @tommi296. May I ask whether you have any securities that are not tradeable in Europe, or any kind of “special” security? It is clear that I should add better error logging, since there is quite a large diversity of portfolios out there… The script was initially created with European funds and ETFs in mind, and the modification to properly handle US securities is still pending.
I have “cleaned up” the portfolio I was running the script on, and now it works (no errors). Thanks.
Hi fizban, thanks a lot for the script. I installed it, and when I try to run the test script I get the following error message:
python3 portfolio-classifier.py test/multifaktortest.xml
Traceback (most recent call last):
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 6, in <module>
from jsonpath_ng import parse
ModuleNotFoundError: No module named 'jsonpath_ng'
Could you please have a look and let me know where the problem is?
Thanks a lot
Stephan
Did you install the requirements?
Yes I did.
pip3 install -r requirements.txt
And I upgraded pip to the newest version beforehand.
If I run it again I get:
Requirement already satisfied: Jinja2==2.11.2 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 1)) (2.11.2)
Requirement already satisfied: requests==2.24.0 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 2)) (2.24.0)
and more
The interesting part would obviously be the one about jsonpath_ng.
Hi Chirlu,
here is the complete output when I run:
"pip3 install -r requirements.txt"
Requirement already satisfied: Jinja2==2.11.2 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 1)) (2.11.2)
Requirement already satisfied: requests==2.24.0 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 2)) (2.24.0)
Requirement already satisfied: requests-cache==0.5.2 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 3)) (0.5.2)
Requirement already satisfied: jsonpath_ng==1.5.3 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 4)) (1.5.3)
Requirement already satisfied: markupsafe==1.1.1 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 5)) (1.1.1)
Requirement already satisfied: beautifulsoup4==4.9.3 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 6)) (4.9.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (1.25.11)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (2022.12.7)
Requirement already satisfied: six in /usr/local/lib/python3.10/site-packages (from jsonpath_ng==1.5.3->-r requirements.txt (line 4)) (1.16.0)
Requirement already satisfied: decorator in /usr/local/lib/python3.10/site-packages (from jsonpath_ng==1.5.3->-r requirements.txt (line 4)) (5.1.1)
Requirement already satisfied: ply in /usr/local/lib/python3.10/site-packages (from jsonpath_ng==1.5.3->-r requirements.txt (line 4)) (3.11)
Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/site-packages (from beautifulsoup4==4.9.3->-r requirements.txt (line 6)) (2.4)
That is weird, since the output indicates that jsonpath_ng is already installed with an up-to-date version.
You can try forcing the reinstallation of that library with
pip3 install --force-reinstall jsonpath-ng
and see if anything changes…
Alternatively, you might have multiple versions of Python on your system. In that case, the best way to make sure the right pip is used is to launch it through the default python3 interpreter:
python3 -m pip install -r requirements.txt
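To check whether the interpreter that runs the script can actually see the library, a small diagnostic along these lines may help. It is only a sketch: `module_location` is a hypothetical helper name, and the file should be run with the same `python3` command that is used to start the script.

```python
import importlib.util
import sys
from typing import Optional


def module_location(name: str) -> Optional[str]:
    """Return the file a module would be imported from, or None if missing."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The interpreter path shows which Python installation is really in use.
print("interpreter:", sys.executable)
print("jsonpath_ng:", module_location("jsonpath_ng"))
```

If the second line prints `None` even though pip reports the requirement as satisfied, pip and `python3` most likely belong to different Python installations.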
Hi fizban,
thanks for the hint. The command "python3 -m pip install -r requirements.txt" worked in the end.
However, if I run: python3 portfolio-classifier.py test/multifaktortest.xml
I get the following error:
python3 portfolio-classifier.py test/multifaktortest.xml
secid 0P00014E87 not found in PortfolioSAL (401) retrieving it from x-ray...
Traceback (most recent call last):
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 703, in <module>
pp_file.add_taxonomy(taxonomy)
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 561, in add_taxonomy
securities = self.get_securities()
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 663, in get_securities
security_h = security.load_holdings()
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 348, in load_holdings
self.holdings.load(isin = self.ISIN, secid = self.secid)
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 462, in load
percentages = [float(key.value[taxonomy['percent']]) for key in value]
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 462, in <listcomp>
percentages = [float(key.value[taxonomy['percent']]) for key in value]
KeyError: 'percent'
Could you please let me know what the next step would be?
Thanks for your help.
BR Stephan
Good day,
with the command "python3 portfolio-classifier.py test/multifactortest.xml" I also get an error message. My system is Linux Mint.
Traceback (most recent call last):
File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 703, in <module>
pp_file.add_taxonomy(taxonomy)
File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 561, in add_taxonomy
securities = self.get_securities()
File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 663, in get_securities
security_h = security.load_holdings()
File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 348, in load_holdings
self.holdings.load(isin = self.ISIN, secid = self.secid)
File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 432, in load
response = resp.json()
File "/home/name/.local/lib/python3.10/site-packages/requests/models.py", line 898, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 525, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 373, in decode
raise JSONDecodeError("Extra data", s, end, len(s))
simplejson.errors.JSONDecodeError: Extra data: line 1 column 3 - line 3 column 46 (char 2 - 87)
Thank you, @Kalli01, for trying the script. That error looks strange. It seems to point to an error raised by the simplejson library, but the script does not use that library directly. Looking at the code of the requests library, it seems to load simplejson if available and json if not. So I can only suggest uninstalling the simplejson library, which seems to be less “permissive” than the standard json:
python3 -m pip uninstall simplejson
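Before uninstalling anything, one can check which JSON module would be picked up under the "simplejson if available, json otherwise" rule described above, and look at what the server actually returned. The helper names below are hypothetical and the snippet is only a diagnostic sketch, not part of the script:

```python
import importlib.util
import json
from typing import Any, Optional


def active_json_backend() -> str:
    """Name the module that the import order described above would select."""
    return "simplejson" if importlib.util.find_spec("simplejson") else "json"


def parse_json_text(text: str) -> Optional[Any]:
    """Parse with the standard json module; show the body start on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        print(f"Response is not valid JSON ({exc}); body starts with: {text[:80]!r}")
        return None


print("JSON backend that would be used:", active_json_backend())
```

Passing the text of a response (for example `resp.text` from requests) to `parse_json_text` sidesteps whichever backend requests selected and shows the first characters of the body when parsing fails.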
Thank you again, @bonsai213, for your patience and for trying the script. Indeed, there is a security ID in that XML for which Morningstar does not provide percentages when using the ETF endpoint. I still have to solve the ETF vs. fund endpoint issue, but for now I have just updated the script to handle that case: it prints a message on the screen and continues. Can you try again with the updated script?
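As an illustration of "print a message and continue", a guard of the following kind avoids the KeyError. This is a simplified sketch and not the actual change in the script: `extract_percentages` is a hypothetical name, and the rows are assumed to be plain dictionaries rather than the jsonpath_ng match objects the script works with.

```python
from typing import Iterable, List, Mapping, Optional


def extract_percentages(
    rows: Iterable[Mapping[str, object]],
    percent_key: str,
    secid: str,
) -> Optional[List[float]]:
    """Return the percentages as floats, or None if any row lacks the field."""
    percentages: List[float] = []
    for row in rows:
        raw = row.get(percent_key, "")
        if raw == "":
            # Report the problem and let the caller move on to the next security.
            print(f"No percentage data for {secid}, skipping")
            return None
        percentages.append(float(raw))
    return percentages
```

Because the function iterates over the rows instead of indexing into them, an empty result list simply yields an empty list rather than raising an error.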
Thanks a lot. I will get the new version from GitHub and try again.
I just tested the updated script. It works with the test:
python portfolio-classifier.py test/multifaktortest.xml
If I run it with my PP.xml file, it starts off fine and works for many line items. After about 30 assets I get the following error:
secid 0P00007O1O not found in PortfolioSAL (500) retrieving it from x-ray...
No information on Asset-Type for 0P0001BKCQ
secid 0P0001BKCQ not found in PortfolioSAL (500) retrieving it from x-ray...
No information on Asset-Type for 0P0000EAQS
secid 0P0000EAQS not found in PortfolioSAL (500) retrieving it from x-ray...
Traceback (most recent call last):
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 706, in <module>
pp_file.add_taxonomy(taxonomy)
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 564, in add_taxonomy
securities = self.get_securities()
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 666, in get_securities
security_h = security.load_holdings()
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 348, in load_holdings
self.holdings.load(isin = self.ISIN, secid = self.secid)
File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 462, in load
if value[0].value.get(taxonomy['percent'],"") =="":
IndexError: list index out of range
Can you please have a look again?
Thanks for the tip, @fizban. After I removed simplejson, your script works.
Your example XML and my own XML file are both converted without errors.
I uploaded a new version. As always, ‘experimental’, since there are a lot of portfolio variations out there. There are quite a few changes. Mainly:
- Now it tries to identify whether the security is an ETF, a fund or a stock. If it is a stock, it skips it. If it is a fund or an ETF, it uses the corresponding Morningstar endpoint.
- The script is more verbose and shows more messages, to make it easier to see how far it has progressed.
- The default domain is now ‘de’. To retrieve securities only available in Spain, for example, it should be run at least once with -d es so that the security IDs and their domains are cached in the JSON file. Once all the securities are in the JSON file, it should not matter which domain is passed with the -d parameter.
- The JSON file now stores the security type and the domain. It seems that the domain is important to retrieve the bearer token required by the Morningstar API, especially if a security is only available in one country.
- There should be fewer fallbacks to the x-ray API.
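To make the caching idea in the list above more concrete, here is a minimal sketch of a JSON cache that stores the security ID, type and domain per ISIN and uses the type to pick an endpoint. The file layout and all function names are assumptions made for illustration; the real isin2secid.json may be structured differently.

```python
import json
from pathlib import Path
from typing import Dict, Mapping, Optional

# Hypothetical cache layout: ISIN -> {"secid": ..., "type": ..., "domain": ...}
Cache = Dict[str, Dict[str, str]]


def load_cache(path: Path) -> Cache:
    """Read the cache file, or start with an empty cache if it does not exist."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_cache(path: Path, cache: Cache) -> None:
    """Write the cache back so later runs can reuse type and domain."""
    path.write_text(json.dumps(cache, indent=2, sort_keys=True), encoding="utf-8")


def remember(cache: Cache, isin: str, secid: str, sec_type: str, domain: str) -> None:
    """Record what was learned about a security during this run."""
    cache[isin] = {"secid": secid, "type": sec_type, "domain": domain}


def endpoint_kind(entry: Mapping[str, str]) -> Optional[str]:
    """Return 'etf' or 'fund', or None for stocks and unknown types (skipped)."""
    sec_type = entry.get("type", "").lower()
    return sec_type if sec_type in ("etf", "fund") else None
```

With such a cache, a run with `-d es` would store the `es` domain next to the securities found there, and a later run could read the domain from the entry instead of relying on the `-d` parameter.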