How to Update XML Files in Python?

Author: neptune | 01st-Jul-2024
#Python

XML (Extensible Markup Language) is widely used for representing structured data. Python provides robust libraries for handling XML files. This article covers how to update XML files in Python, focusing on updating element values, deleting elements, and handling empty or null values.

Libraries Used

The primary library used in this tutorial is `xml.etree.ElementTree`, which is part of Python's standard library.


 

    import xml.etree.ElementTree as ET


Sample XML File

Here is a sample XML file we'll be working with:


    <?xml version='1.0' encoding='utf-8'?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
        <url>
            <loc>https://neptuneworld.in/blog/rest-graphql-future-api-design</loc>
            <lastmod>2024-02-25</lastmod>
            <changefreq>always</changefreq>
            <priority>1.0</priority>
            <image:image>
                <image:loc>https://neptuneworld.in/media/static/website/blogs/GraphQL.JPG</image:loc>
            </image:image>
        </url>
    </urlset>




Updating Element Values

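To update an element's value, locate the element and assign the new value to its `text` attribute. Because the sitemap declares a default namespace, every search path needs a prefix that is mapped to that namespace in a dictionary. The prefix name (`default` here) is arbitrary; it only has to match a key in the dictionary.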


    tree = ET.parse('sitemap.xml')
    root = tree.getroot()

    # Namespace dictionary
    namespaces = {
        'default': 'http://www.sitemaps.org/schemas/sitemap/0.9',
        'image': 'http://www.google.com/schemas/sitemap-image/1.1'
    }

    # Update the value of the <priority> element
    for priority in root.findall('.//default:priority', namespaces):
        priority.text = '0.8'

    tree.write('updated_sitemap.xml', encoding='utf-8', xml_declaration=True)
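
One thing to check in the saved file: unless prefixes are registered, ElementTree writes auto-generated prefixes such as `ns0:` instead of the original default namespace and `image:` prefix. The following is a minimal sketch of keeping the original prefixes with `ET.register_namespace`. It uses a trimmed inline copy of the sample rather than `sitemap.xml` so that it runs on its own; the constant names are illustrative.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9'
IMAGE_NS = 'http://www.google.com/schemas/sitemap-image/1.1'

# Trimmed inline copy of the sample sitemap (stands in for sitemap.xml).
SAMPLE = f"""<urlset xmlns="{SITEMAP_NS}" xmlns:image="{IMAGE_NS}">
  <url>
    <loc>https://neptuneworld.in/blog/rest-graphql-future-api-design</loc>
    <priority>1.0</priority>
    <image:image>
      <image:loc>https://neptuneworld.in/media/static/website/blogs/GraphQL.JPG</image:loc>
    </image:image>
  </url>
</urlset>"""

# Register prefixes before serializing. The empty prefix maps the sitemap
# namespace to the default (unprefixed) namespace in the output.
ET.register_namespace('', SITEMAP_NS)
ET.register_namespace('image', IMAGE_NS)

root = ET.fromstring(SAMPLE)
for priority in root.findall('.//default:priority', {'default': SITEMAP_NS}):
    priority.text = '0.8'

# Serialize to a string here; tree.write() applies the same registered prefixes.
output = ET.tostring(root, encoding='unicode')
print(output)
```

`register_namespace` changes a module-wide registry, so it affects every document serialized afterwards in the same process.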




Deleting an Element

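To delete an element, find its parent first and then call `remove()` on the parent, passing the child. ElementTree elements do not keep a reference to their parent, so the search has to start from the parent.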


    tree = ET.parse('sitemap.xml')
    root = tree.getroot()

    # `namespaces` is the dictionary defined in the previous section.
    # Find the parent element of <image:image> and remove the <image:image> child
    for url in root.findall('default:url', namespaces):
        image_elem = url.find('image:image', namespaces)
        if image_elem is not None:
            url.remove(image_elem)

    tree.write('updated_sitemap.xml', encoding='utf-8', xml_declaration=True)
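
Note that `find()` returns only the first match, so the loop above removes at most one `<image:image>` per `<url>`. When an entry can hold several images, or when the parent is not known in advance, one option is to build a child-to-parent lookup and remove every match. The sketch below assumes a hypothetical helper named `remove_all` and an inline sample with a made-up second image URL:

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    'default': 'http://www.sitemaps.org/schemas/sitemap/0.9',
    'image': 'http://www.google.com/schemas/sitemap-image/1.1',
}

# Inline sample with two images in one <url>; the second URL is made up.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://neptuneworld.in/blog/rest-graphql-future-api-design</loc>
    <image:image>
      <image:loc>https://neptuneworld.in/media/static/website/blogs/GraphQL.JPG</image:loc>
    </image:image>
    <image:image>
      <image:loc>https://example.com/second.jpg</image:loc>
    </image:image>
  </url>
</urlset>"""


def remove_all(root, path, namespaces):
    """Remove every element matching `path`; return how many were removed."""
    # Elements keep no reference to their parent, so build a lookup first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # findall() returns a list, so the tree can be modified while looping.
    matches = root.findall(path, namespaces)
    for elem in matches:
        parent_of[elem].remove(elem)
    return len(matches)


root = ET.fromstring(SAMPLE)
removed = remove_all(root, './/image:image', NAMESPACES)
remaining = root.findall('.//image:image', NAMESPACES)
print(removed, len(remaining))
```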



Handling Empty or Null Values

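An element with no content has its `text` set to `None`, while an element that contains only spaces or line breaks has a whitespace-only string. To fill in such elements, check for both cases before assigning the new value.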


    tree = ET.parse('sitemap.xml')
    root = tree.getroot()

    # Update <lastmod> element if it is empty or null
    for lastmod in root.findall('.//default:lastmod', namespaces):
        if lastmod.text is None or lastmod.text.strip() == '':
            lastmod.text = '2024-06-30'

    tree.write('updated_sitemap.xml', encoding='utf-8', xml_declaration=True)
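
To see which inputs the check above catches, the short sketch below parses a few standalone `<lastmod>` variants and reports their `text` value. The helper name `is_blank` and the sample strings are illustrative and not part of the sitemap file.

```python
import xml.etree.ElementTree as ET


def is_blank(elem):
    """Return True if the element has no text or only whitespace."""
    return elem.text is None or elem.text.strip() == ''


# Illustrative variants of the same element.
samples = {
    'self-closing': '<lastmod/>',
    'empty': '<lastmod></lastmod>',
    'whitespace': '<lastmod>   </lastmod>',
    'filled': '<lastmod>2024-02-25</lastmod>',
}

results = {}
for label, xml_text in samples.items():
    elem = ET.fromstring(xml_text)
    results[label] = (elem.text, is_blank(elem))
    print(f'{label}: text={elem.text!r}, blank={is_blank(elem)}')
```

Only the last variant is left alone by the update loop; the other three count as blank.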



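How to Read a Value

To read a value, first check that `find()` actually returned an element. It returns `None` when nothing matches, and reading `.text` on `None` raises an `AttributeError`. The snippet below assumes a namespace dictionary `ns` that maps the prefix `ns` and a document containing a `MsgId` element; for the sitemap above, substitute `namespaces` and a path such as `.//default:lastmod`.

    msgid_element = root.find('.//ns:MsgId', ns)

    if msgid_element is not None:
        msgid_value = msgid_element.text
        print(f'MsgId: {msgid_value}')
    else:
        print('MsgId element not found')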
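
The same check can be written more compactly with `findtext()`, which returns the text of the first match or a default when nothing matches. The sketch below applies it to a trimmed inline copy of the sitemap; the `'not found'` default is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {'default': 'http://www.sitemaps.org/schemas/sitemap/0.9'}

# Trimmed inline copy of the sample sitemap, without <changefreq>.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://neptuneworld.in/blog/rest-graphql-future-api-design</loc>
    <lastmod>2024-02-25</lastmod>
  </url>
</urlset>"""

root = ET.fromstring(SAMPLE)

# findtext() returns the first match's text, or `default` if nothing matches.
lastmod = root.findtext('.//default:lastmod', default='not found',
                        namespaces=NAMESPACES)
changefreq = root.findtext('.//default:changefreq', default='not found',
                           namespaces=NAMESPACES)

print(lastmod)
print(changefreq)
```

If the element exists but has no text, `findtext()` returns an empty string rather than the default, so the blank check from the previous section still applies.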


Complete Example

Here is a complete example that demonstrates updating an element value, deleting an element, and handling empty or null values.


    import xml.etree.ElementTree as ET

    # Load the XML file
    tree = ET.parse('sitemap.xml')
    root = tree.getroot()

    # Namespace dictionary
    namespaces = {
        'default': 'http://www.sitemaps.org/schemas/sitemap/0.9',
        'image': 'http://www.google.com/schemas/sitemap-image/1.1'
    }

    # Update the value of the <priority> element
    for priority in root.findall('.//default:priority', namespaces):
        priority.text = '0.8'

    # Find the parent element of <image:image> and remove the <image:image> child
    for url in root.findall('default:url', namespaces):
        image_elem = url.find('image:image', namespaces)
        if image_elem is not None:
            url.remove(image_elem)

    # Update <lastmod> element if it is empty or null
    for lastmod in root.findall('.//default:lastmod', namespaces):
        if lastmod.text is None or lastmod.text.strip() == '':
            lastmod.text = '2024-06-30'

    # Save the updated XML file
    tree.write('updated_sitemap.xml', encoding='utf-8', xml_declaration=True)



Conclusion

Handling XML files in Python is straightforward with the `xml.etree.ElementTree` module. You can update element values, delete elements, and handle empty or null values efficiently. This tutorial provides a foundation for manipulating XML files, allowing you to adapt these methods to more complex XML structures as needed.



