Automatic import of classifications

I had to remove -xr option. It was designed as a fallback in case that the main method of retrieval doesn’t provide results. However, it relied on Instant X-Ray which has been discontinued by Morningstar.

1 Like

ok; you may consider to update the description in Git HUb

Hi all, hi @Alfons1Qvor12,

I am looking for your feedback with the automatic import of classifications via JSON. With Version 0.78.0, you can export a taxonomy to JSON.

The option is in the upper right corner:

The format looks like like the one below - it has a block categoriesand a block instruments:

{
  "name": "Asset Classes",
  "categories": [
    {
      "name": "Cash",
      "key": "CASH",
      "color": "#c437c2"
    },
    {
      "name": "Equity",
      "key": "EQUITY",
      "color": "#5757ff",
      "children": [ ... ]
    }
  ],
  "instruments": [
    {
      "identifiers": {
        "name": "iShares Core Euro Government Bond UCITS ETF (Dist)",
        "isin": "IE00B4WXJJ64",
        "wkn": "A0RL83",
        "ticker": "EUNH.DE"
      },
      "categories": [
        {
          "key": "EQUITY",
          "path": [
            "Equity"
          ],
          "weight": 100.0
        }
      ]
    }
  ]
}

In the same format, one can import taxonomies.

This is the behavior when importing:

  • It will use the category “key” to find the category
  • If it cannot be found by key, then the import will try to find it by path
  • If it cannot be found neither by key nor path, a new category will be created - with the location based on the path
  • Instruments are searched first by ISIN, then by ticker, then by WKN, and finally by name
  • If the instrument is not found, no new instrument will be created (the idea is that one can apply a taxonomy to one owns set of instrument - not to import all instruments)
  • You can choose to not update name, description, and color when importing (the idea is that one might locally name the categories - country names - and does not want the import to overwrite the names every time - that makes sense when using the category “key” as identifier - say the two digit ISO identifier of the country)
  • Based on this: the “path” element in the “instruments” section is not needed if the key is given and all categories have keys. One also does not have to give all identifiers for an instrument (ISIN, ticker) but only the ones available.

My plan is to keep this format stable - once I updated it with your feedback. Then the script could generate a JSON file and allow an easier import into your “productive” PP file.


Where exactly?
If it is the funnel, it does not appear to be functioning.

The expot button is for csv.

1 Like

Ah, good catch. If you select the tree viewer (the first two views), the import/export should show up. I will need to add it to all views.

Thanks, now i have found it. Cool stuff!


Now everyone can write a script for their preferred data sources that updates the values in the exported json file.

1 Like

Hi @AndreasB,

Thanks for this new functionality. The format and the principle approach look good to me, but I also have a few concerns.

There is one thing in the format that I am missing: The sequence of categories is not reflected in the data. (I think it is rank in xml). (There is also no data about the sequence of instruments, but I don’t think it is as relevant). I am not sure, if it is really needed, but I see two possible use cases: 1) User exports taxonomy to json, deletes the taxonomy, and imports taxonomy again. It should look the same. (Maybe this is solved already by the sequence in the json file?). 2) A future version of portfolio-classifier modifies an exported json file and adds a new category (e.g. a new company name in taxonomy Holding). Currently, it is just added at the end. More generally, it would be nice to be able to modify the sequence of categories during import.

Now about a potential future use in context by portfolio-classifier.py: It would be a bit of work to adapt, but the advantage will be that it is decoupled from the actual format in which the PP files are stored. However, I have two concerns about the workflow for the user (which are much more relevant than the comment about rank/sequence above):

  1. It seems that export and import are for individual taxonomies only. Would it be possible to add a function for export and import of all taxonomies at once via a single json file? portfolio-classifier.py currenty supports ‘Asset Type’, ‘Stock Style’, ‘Stock Sector’, ‘Bond Style’, ‘Bond Sector’, ‘Region’, ‘Region (Bonds)’, ‘Country’, ‘Country (Bonds)’ and ‘Holding’. It would be very cumbersome to export and import them separately. (And portfolio-classifier.py would need to be able to handle all the separate files).
  2. If the user adds a new security, portfolio-classifier.py will currently not know about this new security by looking at the json file(s) only. As long as it has no classification, it does not show up in the json file(s). In such case, portfolio-classifier.py would still need to rely on PP file (which somehow destroys the biggest benefit of moving the script to the new JSON file). There are probably several options how to address this. One could be entries for instruments without any category in instruments block of taxonomy. Another might be a list of all instruments (without categories) at the beginning of this one big json file discussed above that contains all taxonomies together in one file.

Best regards,
A1Qv12

P.S.: What I like very much about the new interface: It will be possible to save, modify and create templates for how categories look like in PP. Even without any element in the instrument block, it is a very useful tool. (But also for this, being able to re-define the sequence of categories would be of much benefit).

1 Like

Two more thought:

  • The new interface for taxonomy import lacks the option to create a new taxonomy from scratch. (Also this could be achieved via a single JSON file for multiple taxonomies, but makes handling of course a bit more complex).
  • Removal / clean-up of categories doesn’t seem to be possible either. When working directly in the PP file, empty categories (like holdings in taxonomy Holding which are not included in any fund anymore because they changed name or have been removed from the fund) could in principle be removed. This new taxonomy import function however only adds new categories, but it is not able to delete obsolete ones.
1 Like

Good point. New categories are added at the end. Of course, when importing into a new taxonomy, the order is determined by the order in the JSON. However, when importing into an existing taxonomy, new categories are added at the end. I have to think about it - some kind of relative merging.

I could add all instruments to the export, but leave out the “categories” element if the instruments are not assigned.

I can add that. When importing such a “multi taxonomy file”, I would match the taxonomy based on name. Ok?

Right now, this is part of the sidebar menu. The little (+) next to the “Taxonomy” heading also has the “import taxonomy” option which is creating a new taxonomy based on the JSON. Maybe I should also put this into the main menu.

Good point. I could “prune” empty categories. Or I could say, delete categories not present in the import.

My thinking here is: Not all instruments will be part of an external source. Let’s say I have cash accounts, I have custom instruments for my collectibles, etc. I want to the user to manually assign categories for those items all the while importing the for “well-known” (let’s say exchange traded) instruments. Therefore it should be possible to create a category “Collectibles” which is kept when importing “asset types” officially.

Thanks for the great feedback. It will make this feature very useful.

2 Likes

Thanks a lot!

Some replies/thoughts:

  • Matching taxonomy by name is perfectly fine with me. You might want to consider using your key+name concept here as well. It would allow users to display other names of taxonomies than those created by the script (e.g. in German and not in English).
  • Thinking more about rank/sequence of categories, the most attractive use case for me is having a template (without instruments) with colours and the “right” sequence of e.g. countries or regions. This could be applied across multiple PP files and would always create the same look and feel of e.g. pie charts. Applying the template to existing taxonomies would require to also bring exsiting sequences into the “right” order of the template. (And I think that PP already has this rank attribute anyway e.g. in XML).
  • I agree that selecting the right securities for classification is challenging. The current portfolio-classifier.py script selects only active securties and those with an ISIN. So some additional attributes (like active/inactive) would be nice-to-have in the export.

wirklich super, vielen Dank. Wenn jemand eine gut Datenquelle kennt, bitte posten.

Ich habe es jetzt mal so gelöst das ich die Daten für ein ETF oder Instrument aus morningstar in eine CSV-Datei kopiere. Die Datei einhält in der Zeile 1 die ISIN und dann in Spalte 1 die Kategorien (z.B. Sektoren) und in Spalte 2 die %-Werte. Das Phyton-Skript liest die CSV und die Export-JSON-Datei aus PP mit meiner Struktur (müssen im selben Folder wie das Skript liegen, die Dateinamen spielen keine Rolle) und erstellt dann eine neue JSON-Datei mit den neuen Prozentwerten. Als Dateiname wird die ISIN verwendet. Diese kann dann ohne Problem in PP importiert werden. Skript kann ich gerne teilen

Am besten wäre natürlich eine download aller relevanten Titel direkt im JSON oder CSV Format. habe ich aber noch nicht gefunden.

Hallo Andreas,

es gibt noch ein unerwartetes Verhalten. Wenn ich mein Klassifikations-Struktur (z.B Sektoren) als JSON exportiere und die gleich Datei wieder importiere, werden an einigen Instrumenten Änderungen (Delete, Update, Create) vorgenommen. Wie ist das zu erklären?

Gute Frage. Kannst Du mir ein Beispiel nennen? Vielleicht habe ich da noch Fälle übersehen.

Hi,
I managed to run the whole script. The system generates the pp_classified file, which opens correctly but contains no classifications. On the command prompt, for all the ETFs I entered, I get a message like this. What am I doing wrong?
I am an Italian investor.

Retrieving data for fund LU0908500753 (0P0001SGE6) …
[Amundi Stoxx Europe 600 UCITS ETF Acc]
Warning: No information on Asset-Type for 0P0001SGE6 [400]
Warning: No information on Stock-style for 0P0001SGE6 [400]
Warning: No information on Sector for 0P0001SGE6 [400]
Warning: No information on Holding for 0P0001SGE6 [400]
Warning: No information on Region for 0P0001SGE6 [400]
Warning: No information on Country for 0P0001SGE6 [400]

Am besten ich schicke dir das JSON-Eport-File. Hast du einen anderen Weg als über das Forum?

Beispielsweise:

2 Likes

Ciao @Topkapi1971,

It seems that the corresponding web services of Morningstar are down. I don’t know, if this is temporary or not, but it seems that they are in the process of making major changes. So at least for the moment, my main branch of pp-portfolio-classifier doesn’t work properly.

Luckily, I have started a new branch for the script which uses a different API and which still works. That branch has a lot of additional features and I plan to make it the main branch sooner or later. Except for the documentation and for retrieval issues for a few less common stocks (for which the new branch also still relies on the service which is currently not working), the new-api-branch is anyway the better alternative.

So in short: Please use new-api-branch of Alfons1Qto12/pp-portfolio-classifier.

Thanks a lot for your prompt answer. I will test it asap. Where I can find it ?

@Topkapi1971 And the direct link to the new-API-branch is in: