Efficiently getting XML into Elasticsearch

Converting XML to JSON is mostly a question of understanding the actual data in the XML: the mapping is often not straightforward and usually needs additional logic. For this reason, there are no error-proof XML-to-JSON translators.

If you decide to use Python for this, take a look at ElementTree (xml.etree.ElementTree in the standard library), lxml, and xmltodict. JSON support is built into Python's standard library through the json module.
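As a rough illustration of the extra logic such a conversion tends to need, here is a minimal sketch that uses only the standard library. The sample document, the function name element_to_dict, and the conventions it applies (an "@" prefix for attributes, a "#text" key for mixed text, lists for repeated tags) are assumptions made for this example, loosely modelled on the style xmltodict uses. Your data may well call for different rules.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the original question.
SAMPLE_XML = """<book id="42">
  <title>Example</title>
  <author>Ann</author>
  <author>Bob</author>
</book>"""


def element_to_dict(elem):
    """Convert an Element to a str or dict using one possible convention."""
    children = list(elem)
    text = (elem.text or "").strip()
    # A leaf element without attributes collapses to its text.
    if not children and not elem.attrib:
        return text
    # Attributes are stored under "@name" keys.
    result = {"@" + name: value for name, value in elem.attrib.items()}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag is promoted to a list on its second occurrence.
            existing = result[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                result[child.tag] = [existing, value]
        else:
            result[child.tag] = value
    if text:
        # Text that sits next to attributes or children goes under "#text".
        result["#text"] = text
    return result


root = ET.fromstring(SAMPLE_XML)
doc = {root.tag: element_to_dict(root)}
print(json.dumps(doc, indent=2))
```

Note the decisions the code has to make: a single author would come out as a string while two authors come out as a list, and every value is a string because XML carries no type information. Elasticsearch mappings generally expect a field to have a consistent shape and type, so choices like these are where the real work lies.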

If you would rather try your luck on the Elasticsearch side, look at elasticsearch-xml. It may fit your needs if your XML is consistent.

As for Python versus Java parsing performance: if performance is key for you, you can lean on Python libraries that are already optimized at a low level, but in general, good Java code should perform better.
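On the Python side, one common way to keep parsing cheap for large files is to stream the document rather than load it whole. The sketch below uses ElementTree's iterparse and emits newline-delimited JSON in the shape the Elasticsearch bulk API accepts (an action line followed by a source line). The sample data, the record tag "item", the index name "products", and the helper name iter_bulk_lines are all hypothetical, and the flat child-tag-to-text mapping assumes records without nesting or attributes.

```python
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data; a real input would be a large file on disk.
SAMPLE_XML = b"""<catalog>
  <item><sku>A1</sku><name>Widget</name></item>
  <item><sku>B2</sku><name>Gadget</name></item>
</catalog>"""


def iter_bulk_lines(source, record_tag, index_name):
    """Yield bulk-API NDJSON lines, handling one record element at a time."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        # Flat mapping of child tag -> text; assumes records have no nesting.
        doc = {child.tag: (child.text or "").strip() for child in elem}
        # Action line, then the document source line.
        yield json.dumps({"index": {"_index": index_name}})
        yield json.dumps(doc)
        # Drop the children of the processed record to limit memory use.
        elem.clear()


lines = list(iter_bulk_lines(io.BytesIO(SAMPLE_XML), "item", "products"))
# The bulk body must end with a newline.
bulk_body = "\n".join(lines) + "\n"
print(bulk_body)
```

The resulting body could then be sent to the _bulk endpoint with whatever HTTP or Elasticsearch client you already use. Whether this is fast enough compared with a Java pipeline depends on your data volume, so it is worth measuring on a realistic sample.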
