Automatic import of classifications

Hi fizban, thanks a lot for the script. I installed it, but when I try to run the test script I get the following error message:

python3 portfolio-classifier.py test/multifaktortest.xml
Traceback (most recent call last):
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 6, in <module>
    from jsonpath_ng import parse
ModuleNotFoundError: No module named 'jsonpath_ng'

Could you please have a look and let me know where the problem is?

Thanks a lot

Stephan

Did you install the requirements?

Yes I did.

pip3 install -r requirements.txt

And I upgraded pip to the newest version beforehand.

If I run it again I get:

Requirement already satisfied: Jinja2==2.11.2 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 1)) (2.11.2)
Requirement already satisfied: requests==2.24.0 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 2)) (2.24.0)

and more

The interesting part would obviously be that about jsonpath_ng.

Hi Chirlu,

here is the complete output when I run:

"pip3 install -r requirements.txt"

Requirement already satisfied: Jinja2==2.11.2 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 1)) (2.11.2)

Requirement already satisfied: requests==2.24.0 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 2)) (2.24.0)

Requirement already satisfied: requests-cache==0.5.2 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 3)) (0.5.2)

Requirement already satisfied: jsonpath_ng==1.5.3 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 4)) (1.5.3)

Requirement already satisfied: markupsafe==1.1.1 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 5)) (1.1.1)

Requirement already satisfied: beautifulsoup4==4.9.3 in /usr/local/lib/python3.10/site-packages (from -r requirements.txt (line 6)) (4.9.3)

Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (2.10)

Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (3.0.4)

Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (1.25.11)

Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests==2.24.0->-r requirements.txt (line 2)) (2022.12.7)

Requirement already satisfied: six in /usr/local/lib/python3.10/site-packages (from jsonpath_ng==1.5.3->-r requirements.txt (line 4)) (1.16.0)

Requirement already satisfied: decorator in /usr/local/lib/python3.10/site-packages (from jsonpath_ng==1.5.3->-r requirements.txt (line 4)) (5.1.1)

Requirement already satisfied: ply in /usr/local/lib/python3.10/site-packages (from jsonpath_ng==1.5.3->-r requirements.txt (line 4)) (3.11)

Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/site-packages (from beautifulsoup4==4.9.3->-r requirements.txt (line 6)) (2.4)

That is weird, since the output indicates that jsonpath_ng is already installed with an up-to-date version.
You can try forcing the reinstallation of that library with

pip3 install --force-reinstall jsonpath-ng

and see if anything changes…

Alternatively, you might have multiple versions of Python on your system. In that case the best way to make sure the matching pip is used is to launch it through the default python3 interpreter:

python3 -m pip install -r requirements.txt
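
To check which interpreter python3 actually resolves to, and whether that interpreter can see the package, a quick diagnostic along these lines may help. This is only a minimal sketch; the helper name describe_environment is made up for illustration and is not part of the script.

```python
import importlib.util
import sys


def describe_environment(module_names):
    """Report the running interpreter and where it finds each module."""
    report = {"interpreter": sys.executable, "modules": {}}
    for name in module_names:
        # find_spec returns None when the module cannot be imported
        # from this interpreter's search path.
        spec = importlib.util.find_spec(name)
        report["modules"][name] = spec.origin if spec else None
    return report


if __name__ == "__main__":
    info = describe_environment(["jsonpath_ng", "requests"])
    print("interpreter:", info["interpreter"])
    for name, location in info["modules"].items():
        print(f"{name}: {location or 'NOT FOUND'}")
```

If jsonpath_ng shows up as NOT FOUND here although pip3 reports it as installed, pip3 and python3 most likely belong to different Python installations.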

Hi fizban,
thanks for the hint. The command “python3 -m pip install -r requirements.txt” worked in the end.
However, if I run: python3 portfolio-classifier.py test/multifaktortest.xml

I get back the following error:

python3 portfolio-classifier.py test/multifaktortest.xml

secid 0P00014E87 not found in PortfolioSAL (401) retrieving it from x-ray...
Traceback (most recent call last):
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 703, in <module>
    pp_file.add_taxonomy(taxonomy)
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 561, in add_taxonomy
    securities = self.get_securities()
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 663, in get_securities
    security_h = security.load_holdings()
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 348, in load_holdings
    self.holdings.load(isin = self.ISIN, secid = self.secid)
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 462, in load
    percentages = [float(key.value[taxonomy['percent']]) for key in value]
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 462, in <listcomp>
    percentages = [float(key.value[taxonomy['percent']]) for key in value]
KeyError: 'percent'

Could you please let me know what the next step would be?

Thanks for your help.

BR Stephan

Good day,

with the command “python3 portfolio-classifier.py test/multifaktortest.xml” I also get an error message. My system is Linux Mint.

Traceback (most recent call last):
  File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 703, in <module>
    pp_file.add_taxonomy(taxonomy)
  File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 561, in add_taxonomy
    securities = self.get_securities()
  File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 663, in get_securities
    security_h = security.load_holdings()
  File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 348, in load_holdings
    self.holdings.load(isin = self.ISIN, secid = self.secid)
  File "/home/name/Downloads/pp-portfolio-classifier-main/portfolio-classifier.py", line 432, in load
    response = resp.json()
  File "/home/name/.local/lib/python3.10/site-packages/requests/models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 373, in decode
    raise JSONDecodeError("Extra data", s, end, len(s))
simplejson.errors.JSONDecodeError: Extra data: line 1 column 3 - line 3 column 46 (char 2 - 87)

Thank you, @Kalli01, for trying the script. That error looks strange. It is raised by the simplejson library, but the script does not use that library directly. Looking at the code of the requests library, it seems to load simplejson if available and the standard json otherwise. So I can only suggest uninstalling the simplejson library, which seems to be less “permissive” than the standard json:
python3 -m pip uninstall simplejson
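
Whichever JSON library ends up being used, an “Extra data” message means the response body contained something after the first JSON value. A hedged sketch of how a caller could guard against such bodies is shown here; parse_json_body is a hypothetical helper, not a function from the script, and it uses only the standard json module.

```python
import json


def parse_json_body(text):
    """Return the decoded JSON value, or None if the body is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError as err:
        # json.JSONDecodeError is a subclass of ValueError, so this also
        # covers bodies with trailing "extra data" after the first value.
        print(f"Response is not valid JSON ({err}); body starts with {text[:80]!r}")
        return None


if __name__ == "__main__":
    print(parse_json_body('{"ok": true}'))
    print(parse_json_body('{}\n<html>not json</html>'))
```

Printing the start of the body usually shows quickly whether the server returned an error page instead of data.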

Thank you again, @bonsai213, for your patience and for trying the script. Indeed, there is a security id in that xml for which Morningstar does not provide percentages when using the ETF endpoint. I still have to solve the ETF vs. fund endpoint question, but for now I have updated the script to handle that case: it prints a message on the screen and continues. Can you try again with the updated script?
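
The idea of skipping a security without percentages could look roughly like this. It is a simplified sketch with plain dictionaries, not the code from the script; the function name extract_percentages, the key "weight" and the ids FUND_A and FUND_B are placeholders.

```python
def extract_percentages(entries, percent_key):
    """Return the percentages as floats, or None if any entry lacks the key."""
    percentages = []
    for entry in entries:
        raw = entry.get(percent_key, "")
        if raw == "":
            # The endpoint delivered the breakdown without a percentage.
            return None
        percentages.append(float(raw))
    return percentages


if __name__ == "__main__":
    holdings = {
        "FUND_A": [{"name": "Technology", "weight": "60.5"},
                   {"name": "Energy", "weight": "39.5"}],
        "FUND_B": [{"name": "Technology"}],  # no percentage provided
    }
    for secid, entries in holdings.items():
        result = extract_percentages(entries, "weight")
        if result is None:
            print(f"No percentages for {secid}, skipping")
            continue
        print(secid, result)
```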

@fizban,

thanks a lot. I will get the new version from github and try again.

@fizban

I just tested the updated script. It is working with the test :grinning::
python portfolio-classifier.py test/multifaktortest.xml

If I run it with my PP.xml file, it works for many line items at first. After about 30 assets I get the following error:

secid 0P00007O1O not found in PortfolioSAL (500) retrieving it from x-ray...
No information on Asset-Type for 0P0001BKCQ
secid 0P0001BKCQ not found in PortfolioSAL (500) retrieving it from x-ray...
No information on Asset-Type for 0P0000EAQS
secid 0P0000EAQS not found in PortfolioSAL (500) retrieving it from x-ray...
Traceback (most recent call last):
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 706, in <module>
    pp_file.add_taxonomy(taxonomy)
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 564, in add_taxonomy
    securities = self.get_securities()
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 666, in get_securities
    security_h = security.load_holdings()
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 348, in load_holdings
    self.holdings.load(isin = self.ISIN, secid = self.secid)
  File "/Users/stephan/pp-portfolio-classifier-main/portfolio-classifier.py", line 462, in load
    if value[0].value.get(taxonomy['percent'],"") =="":
IndexError: list index out of range

Can you please have a look again?
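
For reference, the IndexError comes from reading value[0] when the list of matches is empty. A guard along the following lines would avoid it. This is only a sketch with plain lists and dictionaries (in the script the entries are wrapped objects with a .value attribute), and has_percentages is a hypothetical name.

```python
def has_percentages(matches, percent_key):
    """True only if there is at least one match and its first entry has the key."""
    if not matches:
        # Empty list: nothing was found in the response, so matches[0]
        # would raise IndexError.
        return False
    return matches[0].get(percent_key, "") != ""


if __name__ == "__main__":
    print(has_percentages([], "weight"))
    print(has_percentages([{"weight": "12.3"}], "weight"))
```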

Thanks for the tip, @fizban. After I removed simplejson, your script works.
Your example xml and my own xml file are both converted without errors.

I uploaded a new version. As always, it is ‘experimental’, since there are a lot of portfolio variations out there. There are quite a few changes. Mainly:

  • It now tries to identify whether the security is an ETF, fund or stock. If it is a stock, it skips it. If it is a fund or ETF, it uses the corresponding Morningstar endpoint.
  • The script is more verbose and shows more messages, so the current progress point is easier to identify.
  • The default domain is now ‘de’. To retrieve securities only available in Spain, for example, the script should be run at least once with -d es so that it can cache the security ids and their domains in the json file. Once all the securities are in the json file, it should not matter which domain is passed with the -d parameter.
  • The json file now stores the security type and the domain. It seems that the domain is important to retrieve the bearer token required by the Morningstar API, especially if a security is only available in one country.
  • There should be fewer fallbacks to the x-ray API.
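
To illustrate the caching idea from the list above, here is a minimal sketch of a json cache that remembers id, type and domain per security. The file name, the field names and the helper functions are assumptions for illustration and may differ from what the script actually writes; the ISIN and secid are placeholders.

```python
import json
import os
import tempfile


def load_cache(path):
    """Load the cache file, returning an empty dict if it does not exist yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def remember_security(cache, isin, secid, security_type, domain):
    """Store the security id, type and domain under the ISIN."""
    cache[isin] = {"secid": secid, "type": security_type, "domain": domain}


def save_cache(path, cache):
    """Write the cache back to disk as readable json."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(cache, handle, indent=2, sort_keys=True)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "cache_example.json")
    cache = load_cache(path)
    # Placeholder values, not real identifiers.
    remember_security(cache, "XX0000000001", "SECID_PLACEHOLDER", "etf", "es")
    save_cache(path, cache)
    print(load_cache(path))
```

Once an entry like this exists, a later run can look up the stored domain for that security instead of relying on the -d parameter.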

@fizban Thanks a lot, with the new version the script runs successfully without errors. :+1:

I have been successful with importing the classifications. Really good, I appreciate this import. But now I have run it a second time (the first time the program converted file A into B, this time it converted B into file C).
C now contains the classifications twice. It seems that an update is not possible. I guess this is due to limitations of the PP APIs.
My question now is how I can copy the classification of fund ABC from file B into my original file, where all the other stocks etc. are. Is there a more efficient way than doing this manually?
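
A quick way to check whether a file already contains a classification more than once is to count the taxonomy names in the xml. The sketch below assumes that each classification is stored as a taxonomy element with a name child; that layout is an assumption about the PP file format, and the sample xml is made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_taxonomy_names(xml_text):
    """Return the sorted names that occur on more than one <taxonomy> element."""
    root = ET.fromstring(xml_text)
    names = [taxonomy.findtext("name", default="") for taxonomy in root.iter("taxonomy")]
    # Ignore elements without a name (for example pure references).
    counts = Counter(name for name in names if name)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    sample = """
    <client>
      <taxonomies>
        <taxonomy><name>Country</name></taxonomy>
        <taxonomy><name>Sector</name></taxonomy>
        <taxonomy><name>Country</name></taxonomy>
      </taxonomies>
    </client>
    """
    print(duplicate_taxonomy_names(sample))
```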

By the way, I noticed that I get some minor errors because the import writes entries with 0,00 % into the classification.
(Screenshots: Import_Error1, Import_Error2)
This happens in all the classifications: Country, Sector, …
Thanks

Hi guys. I am a PP newbie who doesn’t understand anything about coding but is really into classification.
Could someone take the opportunity to summarise the progress on the automatic classification feature? I think it could save a lot of users a lot of time and frustration, and it could really lift PP to another level.

Hi, this is an awesome tool and works as advertised!

I just wonder whether it would be possible to also scrape the classification for the shares in the portfolio, or to import an existing classification of shares into the output file.

My problem is that I have a mixed ETF / shares portfolio and always have to classify the shares manually. Or maybe I am doing something wrong?

Cheers and have a great weekend!


Hello maddhin,

good query. I’ve also asked for such a feature half a year ago.

@fizban And one more thing: the script works well, even if single securities/stocks are included in the portfolio. However, it does not carry the existing classifications already assigned to these stocks over into the newly calculated assignments in the “pp_classified.xml” file.
Is there an option to make this possible?
So let’s say I have already assigned “Country–>UnitedStates” to a single security. Then it should be possible for this assignment to appear in the newly calculated classification “Country” as “Country–>UnitedStates”.