  1. XMLBeans
  2. XMLBEANS-637

Contiguous elements of the same type are combined incorrectly when generating an XSD from an XML instance

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Version 3.0.1, Version 5.1.0
    • Fix Version/s: Version 5.2.0
    • Component/s: Cursor
    • Labels: None

    Description

      Steps to reproduce

      1- Use the XML instance below to generate an XSD schema with XMLBeans v5.1.0 (or v3.0.1). Note that the document contains two contiguous <Result> nodes.

      <data>
          <Code>6065</Code>
          <LocNum>6065</LocNum>
          <StockNum>23123191</StockNum>
          <Vin>1C4NJRFB4GD618747</Vin>
          <YearCode>g</YearCode>
          <MakeCode>JE</MakeCode>
          <ModelCode>PATR</ModelCode>
          <TrimCode>HIAL</TrimCode>
          <BodyCode>S006</BodyCode>
          <EngineCode>0024</EngineCode>
          <FuelType>G</FuelType>
          <TransCode>A</TransCode>
          <ClassCode>80</ClassCode>
          <Color>100</Color>
          <IntColor>2392</IntColor>
          <PrevProdStatus>525</PrevProdStatus>
          <ProdStatus>520</ProdStatus>
          <Mileage>33333</Mileage>
          <LastLotDate>2022-12-12T05:12:53.826-04:00</LastLotDate>
          <LastLotAssign>LB5</LastLotAssign>
          <Result>
              <child-result>test1</child-result>
          </Result>
          <Result>
              <child-result>test2</child-result>
          </Result>
      </data> 

       

      2- Use the following code snippet to generate the XSD schema from the XML instance above.

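      import java.nio.file.Files;
      import java.nio.file.Paths;

      import org.apache.xmlbeans.XmlObject;
      import org.apache.xmlbeans.impl.inst2xsd.Inst2Xsd;
      import org.apache.xmlbeans.impl.inst2xsd.Inst2XsdOptions;
      import org.apache.xmlbeans.impl.xb.xsdschema.SchemaDocument;

      // The class name is arbitrary; it only wraps the reproduction steps.
      public class Inst2XsdRepro {
          public static void main(String[] args) {
              try {
                  // Parse the XML instance from step 1.
                  XmlObject[] xmlInstances = new XmlObject[1];
                  xmlInstances[0] = XmlObject.Factory.parse(new String(Files.readAllBytes(Paths.get("path_to_the_xml_file"))));

                  // Russian Doll design, no enumerations, smart simple-content type detection.
                  Inst2XsdOptions inst2XsdOptions = new Inst2XsdOptions();
                  inst2XsdOptions.setDesign(Inst2XsdOptions.DESIGN_RUSSIAN_DOLL);
                  inst2XsdOptions.setUseEnumerations(Inst2XsdOptions.ENUMERATION_NEVER);
                  inst2XsdOptions.setSimpleContentTypes(Inst2XsdOptions.SIMPLE_CONTENT_TYPES_SMART);

                  // Generate the schema and print the first resulting document.
                  SchemaDocument[] schemaDocuments = Inst2Xsd.inst2xsd(xmlInstances, inst2XsdOptions);
                  if (schemaDocuments != null && schemaDocuments.length > 0) {
                      System.out.println(schemaDocuments[0].toString());
                  }
              } catch (Exception e) {
                  e.printStackTrace();
              }
          }
      }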

      Expected Result:

      In the output XSD schema, the Result element should be declared as repeating (maxOccurs="unbounded" minOccurs="0").

      <schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns="http://www.w3.org/2001/XMLSchema">
        <element name="data">
          <complexType>
            <sequence>
              <element type="xs:short" name="Code" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="LocNum" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:int" name="StockNum" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="Vin" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="YearCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="MakeCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="ModelCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="TrimCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="BodyCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:byte" name="EngineCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="FuelType" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="TransCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:byte" name="ClassCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:byte" name="Color" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="IntColor" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="PrevProdStatus" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="ProdStatus" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:int" name="Mileage" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:dateTime" name="LastLotDate" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="LastLotAssign" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element name="Result" maxOccurs="unbounded" minOccurs="0">
                <complexType>
                  <sequence>
                    <element type="xs:string" name="child-result" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
                  </sequence>
                </complexType>
              </element>
            </sequence>
          </complexType>
        </element>
      </schema>

      Actual Result:

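      The Result element is not declared as repeating. Instead, all children of data are wrapped in a <choice maxOccurs="unbounded" minOccurs="0">.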

      <schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns="http://www.w3.org/2001/XMLSchema">
        <element name="data">
          <complexType>
            <choice maxOccurs="unbounded" minOccurs="0">
              <element type="xs:short" name="Code" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="LocNum" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:int" name="StockNum" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="Vin" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="YearCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="MakeCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="ModelCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="TrimCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="BodyCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:byte" name="EngineCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="FuelType" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="TransCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:byte" name="ClassCode" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:byte" name="Color" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="IntColor" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="PrevProdStatus" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:short" name="ProdStatus" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:int" name="Mileage" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:dateTime" name="LastLotDate" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element type="xs:string" name="LastLotAssign" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
              <element name="Result">
                <complexType>
                  <sequence>
                    <element type="xs:string" name="child-result" xmlns:xs="http://www.w3.org/2001/XMLSchema"/>
                  </sequence>
                </complexType>
              </element>
            </choice>
          </complexType>
        </element>
      </schema>
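
The difference between the two schemas can be checked programmatically. The following is a minimal sketch using only the JDK's DOM parser; the class and method names (RepeatingElementCheck, isRepeating) are hypothetical and not part of XMLBeans. It inspects only the maxOccurs attribute on the element declaration itself, so an enclosing unbounded <choice> does not count.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLBeans.
final class RepeatingElementCheck {
    private static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    /** Returns true if the first xs:element declaration with the given name has maxOccurs > 1. */
    static boolean isRepeating(String xsd, String elementName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xsd)));
            NodeList elements = doc.getElementsByTagNameNS(XSD_NS, "element");
            for (int i = 0; i < elements.getLength(); i++) {
                Element e = (Element) elements.item(i);
                if (elementName.equals(e.getAttribute("name"))) {
                    // getAttribute returns "" when the attribute is absent.
                    String maxOccurs = e.getAttribute("maxOccurs");
                    return "unbounded".equals(maxOccurs)
                            || (!maxOccurs.isEmpty() && Integer.parseInt(maxOccurs) > 1);
                }
            }
            return false;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse schema", ex);
        }
    }

    public static void main(String[] args) {
        // Reduced versions of the expected and actual schemas from this report.
        String expected = "<schema xmlns='http://www.w3.org/2001/XMLSchema'><element name='data'>"
                + "<complexType><sequence>"
                + "<element name='Result' maxOccurs='unbounded' minOccurs='0'/>"
                + "</sequence></complexType></element></schema>";
        String actual = "<schema xmlns='http://www.w3.org/2001/XMLSchema'><element name='data'>"
                + "<complexType><choice maxOccurs='unbounded' minOccurs='0'>"
                + "<element name='Result'/>"
                + "</choice></complexType></element></schema>";
        System.out.println("expected schema, Result repeating: " + isRepeating(expected, "Result"));
        System.out.println("actual schema, Result repeating: " + isRepeating(actual, "Result"));
    }
}
```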

      My Investigation

      Below is what I found while looking into why this happens.

       

      While the input XML instance is parsed, the getName() method in QNameCache.class is called for each name. The first Result node is added to the table right before the table's size reaches the threshold, and a new QName object is allocated for it (see the attached image).

      The class then executes the rehash() method to grow the table so that it can hold more incoming nodes.

      Next, the second Result node is added to the table, but a separate QName object is allocated for it instead of reusing the one created for the first Result node (note that their URI, localName, and prefix are exactly the same).
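
One way an interning cache could end up with two objects for the same name around a resize is sketched below. This is a toy model under stated assumptions, not the actual QNameCache source: ToyInternCache, Name, and the recomputeAfterResize flag are all hypothetical. It assumes the slot index is computed before the table grows and then reused afterwards, so the new entry lands in a slot that later lookups never probe.

```java
// Hypothetical toy cache; it only illustrates the suspected failure mode.
final class ToyInternCache {
    /** Cached value; each allocation is a distinct object so identity can be observed. */
    static final class Name {
        final String text;
        Name(String text) { this.text = text; }
    }

    private Name[] table = new Name[4];   // size is always a power of two
    private int entries = 0;
    private final boolean recomputeAfterResize;

    ToyInternCache(boolean recomputeAfterResize) {
        this.recomputeAfterResize = recomputeAfterResize;
    }

    Name intern(String text) {
        int index = findSlot(table, text);
        if (table[index] != null) {
            return table[index];              // already cached: reuse the same object
        }
        entries++;
        if (entries * 4 >= table.length * 3) { // 75% load factor reached
            resize();
            if (recomputeAfterResize) {
                index = findSlot(table, text); // slot for the new, larger table
            }
            // Otherwise 'index' is stale: it was computed for the old table size.
        }
        return table[index] = new Name(text);
    }

    /** Open addressing with backward linear probing. */
    private static int findSlot(Name[] t, String text) {
        int mask = t.length - 1;
        int index = text.hashCode() & mask;
        while (t[index] != null && !t[index].text.equals(text)) {
            index = (index - 1) & mask;
        }
        return index;
    }

    private void resize() {
        Name[] bigger = new Name[table.length * 2];
        for (Name n : table) {
            if (n != null) {
                bigger[findSlot(bigger, n.text)] = n;
            }
        }
        table = bigger;
    }

    public static void main(String[] args) {
        for (boolean recompute : new boolean[] {false, true}) {
            ToyInternCache cache = new ToyInternCache(recompute);
            cache.intern("a");
            cache.intern("b");
            Name first = cache.intern("d");   // third entry triggers the resize
            Name second = cache.intern("d");  // should return the cached object
            System.out.println("recompute=" + recompute + ", same object: " + (first == second));
        }
    }
}
```

With the stale index the second lookup misses and a second object is allocated, which matches the symptom described above; recomputing the index after the resize returns the cached object.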

       

      After that, in RussianDollStrategy.class, the method processElementsInComplexType() compares those two Result QNames with the == operator to check whether they are contiguous elements of the same type.

      The == operator tests whether two references point to the same object, not whether the objects are equal. In this case it returns false because the two Result QName objects are distinct instances, and as a consequence the element types are not combined.
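
The difference between the two comparisons can be seen with two separately allocated javax.xml.namespace.QName objects. This is a small standalone sketch; the class name QNameIdentityDemo and its helper methods are hypothetical.

```java
import javax.xml.namespace.QName;

// Hypothetical demo class, not part of XMLBeans.
final class QNameIdentityDemo {
    /** Identity comparison: true only when both references point at the same object. */
    static boolean sameReference(QName a, QName b) {
        return a == b;
    }

    /** Value comparison: QName.equals() compares the namespace URI and the local part. */
    static boolean sameName(QName a, QName b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Two separately allocated QNames with the same (empty) namespace and local part.
        QName first = new QName("", "Result");
        QName second = new QName("", "Result");

        System.out.println("==       : " + sameReference(first, second)); // false
        System.out.println("equals() : " + sameName(first, second));      // true
    }
}
```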

       

      I think this case should be covered by adding one more condition that compares the two names by value. Note that QName.equals() compares the namespaceURI and localPart; the prefix does not take part in the comparison.

      else if (currentElem.getName() == child.getName() || currentElem.getName().equals(child.getName()))
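
To illustrate the effect of a value comparison, here is a simplified sketch of grouping contiguous children by name. It is not the RussianDollStrategy code; ContiguousRuns, Run, and group are hypothetical names, and the children are reduced to a plain list of QNames.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.namespace.QName;

// Hypothetical sketch, not the XMLBeans implementation.
final class ContiguousRuns {
    /** One run of adjacent children that share the same name. */
    static final class Run {
        final QName name;
        int count = 1;
        Run(QName name) { this.name = name; }
    }

    /** Groups adjacent equal names; equals() is used so distinct-but-equal objects merge. */
    static List<Run> group(List<QName> children) {
        List<Run> runs = new ArrayList<>();
        Run current = null;
        for (QName child : children) {
            if (current != null && current.name.equals(child)) {
                current.count++;              // same contiguous element type
            } else {
                current = new Run(child);
                runs.add(current);
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // The two Result names are separate objects, as in the reported scenario.
        List<QName> children = Arrays.asList(
                new QName("LastLotAssign"), new QName("Result"), new QName("Result"));
        for (Run run : group(children)) {
            String occurs = run.count > 1 ? "repeating" : "single";
            System.out.println(run.name.getLocalPart() + ": " + occurs);
        }
    }
}
```

A run with a count greater than one corresponds to an element that would be declared with maxOccurs="unbounded".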

       

       

       

       

       

      Attachments

        1. image-2023-05-30-15-00-38-785.png
          267 kB
          Ronan
        2. image-2023-05-30-15-09-05-151.png
          154 kB
          Ronan
        3. image-2023-05-30-15-40-15-343.png
          33 kB
          Ronan

          People

            Assignee: Unassigned
            Reporter: Ronan (ronan.phuc.nguyen)
              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 10m