In this article we will see how to read and manipulate XML data in Java. There are many built-in as well as external APIs for reading and manipulating XML in Java, but we will use the built-in DOM parser and the XPath API.
Below is the XML that we will be updating:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Jobs>
    <Job id="0">
        <position>Data Analyst</position>
        <skill>Python</skill>
        <vacancies>3</vacancies>
    </Job>
    <Job id="2">
        <position>Developer</position>
        <skill>CSS</skill>
        <vacancies>8</vacancies>
    </Job>
    <Job id="3">
        <position>Developer</position>
        <skill>SpringBoot</skill>
        <vacancies>1</vacancies>
    </Job>
</Jobs>
Below is the output XML after manipulation:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Jobs>
    <Job id="1">
        <position>Data Analyst</position>
        <skill>Python</skill>
        <vacancies>4</vacancies>
    </Job>
    <Job id="2">
        <position>Developer</position>
        <skill>CSS</skill>
        <vacancies>9</vacancies>
        <salary>100K</salary>
    </Job>
    <Job id="3">
        <position>Developer</position>
        <skill>SpringBoot</skill>
        <vacancies>2</vacancies>
        <salary>100K</salary>
    </Job>
</Jobs>
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class AppMain {
    public static void main(String[] args) {
        String sourceFilePath = "D:\\Developers.xml";
        String destinationFilePath = "D:\\Developers_updated.xml";
        File xmlFile = new File(sourceFilePath);
        try {
            DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
            Document document = documentBuilder.parse(xmlFile);
            // merge adjacent text nodes and drop empty ones so the tree is in a consistent form
            document.getDocumentElement().normalize();
            // use the XPath API to query the XML
            XPath xPath = XPathFactory.newInstance().newXPath();
            // XPath expression to get the list of Job nodes inside the root Jobs node
            NodeList jobList = (NodeList) xPath.compile("/Jobs/Job").evaluate(document, XPathConstants.NODESET);
            // change the id attribute from 0 to 1 on the matching Job node
            for (int i = 0; i < jobList.getLength(); i++) {
                Node idAttribute = jobList.item(i).getAttributes().getNamedItem("id");
                if (idAttribute != null && "0".equals(idAttribute.getTextContent())) {
                    idAttribute.setTextContent("1");
                }
            }
            // increase the value of every vacancies element by 1
            NodeList vacancies = (NodeList) xPath.compile("/Jobs/Job/vacancies").evaluate(document, XPathConstants.NODESET);
            for (int i = 0; i < vacancies.getLength(); i++) {
                // the first child of <vacancies> is the text node that holds the number
                Node vacancy = vacancies.item(i).getFirstChild();
                int newVacancy = Integer.parseInt(vacancy.getNodeValue().trim()) + 1;
                vacancy.setTextContent(String.valueOf(newVacancy));
            }
            // create a new salary element and append it to every Job
            for (int i = 0; i < jobList.getLength(); i++) {
                Element salary = document.createElement("salary");
                salary.appendChild(document.createTextNode("100K"));
                jobList.item(i).appendChild(salary);
            }
            // remove the salary element from the first Job
            Node firstJob = jobList.item(0);
            NodeList childNodes = firstJob.getChildNodes();
            // getChildNodes() returns a live list, so iterate backwards while removing
            for (int j = childNodes.getLength() - 1; j >= 0; j--) {
                if ("salary".equals(childNodes.item(j).getNodeName())) {
                    // firstJob is the <Job> node; pass it the child node to be removed
                    firstJob.removeChild(childNodes.item(j));
                }
            }
            // write the updated content to the destination XML file
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "8");
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult(new File(destinationFilePath));
            transformer.transform(source, result);
            System.out.println("XML File update completed");
        } catch (SAXException | ParserConfigurationException | IOException | TransformerException | XPathExpressionException e) {
            e.printStackTrace();
        }
    }
}
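The program above reads from and writes to fixed paths on the D: drive. If you just want to experiment with the same steps, the following is a minimal sketch that does one of them (incrementing every vacancies value) entirely in memory. The class name InMemoryXmlUpdate and the method incrementVacancies are made up for this illustration, and the XML declaration is omitted from the output only to keep the printed result short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original program.
class InMemoryXmlUpdate {

    // Parses the XML text, adds 1 to every <vacancies> value and returns the serialized result.
    static String incrementVacancies(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList vacancies = (NodeList) xPath.evaluate(
                    "/Jobs/Job/vacancies", document, XPathConstants.NODESET);
            for (int i = 0; i < vacancies.getLength(); i++) {
                Node vacancy = vacancies.item(i);
                int updated = Integer.parseInt(vacancy.getTextContent().trim()) + 1;
                vacancy.setTextContent(String.valueOf(updated));
            }

            // Serialize the modified DOM tree into a String instead of a file.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Checked parser/transformer exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("XML update failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Jobs><Job id=\"0\"><vacancies>3</vacancies></Job></Jobs>";
        System.out.println(incrementVacancies(xml));
    }
}
```

Working on Strings like this makes it easy to check each manipulation step before pointing the code at real files.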
The DOM parser loads the file into memory as a tree, and XPath lets you fetch the required nodes from that tree through path expressions. XPath is a powerful API that allows you to query data not only by element or attribute names but also by the content of those elements or the values of those attributes.
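To illustrate querying by value, here is a small sketch that runs a few XPath predicates against the same job data used in this article. The class name XPathPredicateDemo and the helper select are hypothetical names chosen for this example; the XML is inlined as a String so that nothing depends on a file path.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original program.
class XPathPredicateDemo {

    // Same data as the input XML of this article, inlined so the sketch needs no file.
    static final String XML =
        "<Jobs>"
        + "<Job id=\"0\"><position>Data Analyst</position><skill>Python</skill><vacancies>3</vacancies></Job>"
        + "<Job id=\"2\"><position>Developer</position><skill>CSS</skill><vacancies>8</vacancies></Job>"
        + "<Job id=\"3\"><position>Developer</position><skill>SpringBoot</skill><vacancies>1</vacancies></Job>"
        + "</Jobs>";

    // Evaluates an XPath expression and returns the text content of every matching node.
    static List<String> select(String xml, String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(expression, document, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Select by attribute value.
        System.out.println(select(XML, "/Jobs/Job[@id='2']/skill"));              // [CSS]
        // Select by the content of a child element.
        System.out.println(select(XML, "/Jobs/Job[position='Developer']/skill")); // [CSS, SpringBoot]
        // Select by a numeric comparison on element content.
        System.out.println(select(XML, "/Jobs/Job[vacancies > 2]/position"));     // [Data Analyst, Developer]
        // Select an attribute of the element whose child has a given value.
        System.out.println(select(XML, "/Jobs/Job[skill='Python']/@id"));         // [0]
    }
}
```

The part in square brackets is an XPath predicate: it filters the Job nodes before the rest of the path is applied, so no manual loop over all jobs is needed.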
The DOM parser may not be suitable for parsing huge files because it loads the entire document into memory. In such cases the alternatives are the SAX parser, JAXB, etc., which have their own advantages and disadvantages.
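For comparison, below is a minimal sketch of the SAX approach using the SAX parser that ships with the JDK. It only reads the document (SAX is a read-only, event-based API) and sums the vacancies values as the parser streams through the elements, so no tree is built in memory. The names SaxVacancyCounter, VacancyHandler, and totalVacancies are hypothetical and chosen for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original program.
class SaxVacancyCounter {

    // Collects the text of each <vacancies> element and keeps a running total.
    static class VacancyHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean insideVacancies;
        private int total;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("vacancies".equals(qName)) {
                insideVacancies = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks, so append rather than assign.
            if (insideVacancies) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("vacancies".equals(qName)) {
                total += Integer.parseInt(text.toString().trim());
                insideVacancies = false;
            }
        }

        int getTotal() {
            return total;
        }
    }

    // Streams through the XML text and returns the sum of all <vacancies> values.
    static int totalVacancies(String xml) {
        VacancyHandler handler = new VacancyHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return handler.getTotal();
    }

    public static void main(String[] args) {
        String xml = "<Jobs>"
            + "<Job id=\"0\"><vacancies>3</vacancies></Job>"
            + "<Job id=\"2\"><vacancies>8</vacancies></Job>"
            + "<Job id=\"3\"><vacancies>1</vacancies></Job>"
            + "</Jobs>";
        System.out.println(totalVacancies(xml)); // 3 + 8 + 1 = 12
    }
}
```

The trade-off is visible here: memory use stays small regardless of file size, but you have to track state yourself (the insideVacancies flag) and you cannot modify the document or run XPath queries on it.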
Feel free to leave questions in the comments.