Add xml.xsd to lexicon schema set #2813

Merged: Tkael merged 1 commit into EDCD:develop from klightspeed:add-xml-xsd on May 16, 2026

@klightspeed (Contributor) commented May 15, 2026

.NET 5.0 and above disable remote XML schema downloads by default, resulting in lexicon schema validation failing due to the XML namespace schema (http://www.w3.org/XML/1998/namespace) not being present in the available schemas.

Add a cached copy of https://www.w3.org/2009/01/xml.xsd (the current immutable version of the schema at http://www.w3.org/2001/xml.xsd) to the lexicon schema set.
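
A minimal sketch of the idea, assuming the cached schema is shipped as an embedded resource. The `LexiconSchemas.AddSchemaFromResource` helper is hypothetical and written only for illustration; it is not EDDI's actual `FetchSchemasFromResource` implementation, whose signature is not shown here.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml.Schema;

static class LexiconSchemas
{
    // Hypothetical helper: reads one embedded .xsd file and adds it to the schema set,
    // so that validation never needs to download the schema.
    public static void AddSchemaFromResource(XmlSchemaSet schemas, Assembly assembly, string resourceName)
    {
        // GetManifestResourceStream returns null when the resource name is not found.
        using var stream = assembly.GetManifestResourceStream(resourceName);
        if (stream == null)
        {
            throw new FileNotFoundException($"Embedded schema '{resourceName}' was not found.");
        }

        // Surface schema parse problems immediately instead of silently ignoring them.
        var schema = XmlSchema.Read(stream, (_, e) => throw new XmlSchemaException(e.Message, e.Exception));
        if (schema != null)
        {
            schemas.Add(schema);
        }
    }
}
```

A call might look like `LexiconSchemas.AddSchemaFromResource(schemas, typeof(SomeSpeechType).Assembly, "EddiSpeechService.Properties.xml.xsd")`, where `SomeSpeechType` stands in for any type in the assembly that embeds the file.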

Reported as #2814

Error reported in EDCD#eddi Discord:

Hi, my lexicon file, which used to work, is now getting renamed to malformed.
I get the following error in the log file:

2026-05-15T10:07:46.2462358Z [16] [Warning] SpeechFormatter:IsValidXML Could not load lexicon file '%APPDATA%\EDDI\lexicons\en.pls', please review.: {"ClassName":"System.Xml.Schema.XmlSchemaValidationException","Message":"The 'http://www.w3.org/XML/1998/namespace:base' attribute is not declared.","Data":null,"InnerException":null,"HelpURL":null,"StackTraceString":"   at System.Xml.Schema.XmlSchemaValidator.SendValidationEvent(XmlSchemaValidationException e, XmlSeverityType severity)\r\n   at System.Xml.Schema.XmlSchemaValidator.RecompileSchemaSet()\r\n   at System.Xml.Schema.XmlSchemaValidator.Init()\r\n   at System.Xml.Schema.XNodeValidator.Validate(XObject source, XmlSchemaObject partialValidationType, Boolean addSchemaInfo)\r\n   at EddiSpeechService.SpeechPreparation.SpeechFormatter.IsValidXML(String filename) in EddiSpeechService\SpeechPreparation\SpeechFormatter.cs:line 277","RemoteStackTraceString":null,"RemoteStackIndex":0,"ExceptionMethod":null,"HResult":-2146231999,"Source":"System.Private.Xml","WatsonBuckets":null,"res":"The '{0}' attribute is not declared.","args":["http://www.w3.org/XML/1998/namespace:base"],"sourceUri":"","lineNumber":69,"linePosition":14,"version":"2.0"}

I get what looks like the same error if I use the example apple to orange lexicon from here https://github.com/EDCD/EDDI/wiki/Lexicons
I think this is probably easy to fix by editing my lexicon file, but I don't understand the problem well enough to fix it. I thought I'd mention it here as it feels like the example file will need the same fix.
I thiiink this started happening when I updated EDDI to 5.0, but I'm not certain.

Example program showing underlying issue:

using System.Diagnostics;
using System.Security.Cryptography;
using System.Xml.Linq;
using System.Xml.Schema;

var pls = """
<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-gb">
  <lexeme>
    <grapheme>apple</grapheme>
    <phoneme>ˈɔɹɪnd͡ʒ</phoneme>
  </lexeme>
</lexicon>
""";

// Download a schema, check its SHA-256 digest and target namespace, then add it to the set.
// Note: the Debug.Assert checks only run in Debug builds.
static async Task AddRemoteSchema(HttpClient client, XmlSchemaSet xmlschemas, string url, string targetns, string sha256sum)
{
    // Skip the download if a schema for this namespace is already loaded.
    if (!xmlschemas.Contains(targetns))
    {
        var schemaData = await client.GetByteArrayAsync(url);
        Debug.Assert(Convert.ToHexStringLower(SHA256.HashData(schemaData)) == sha256sum);
        using var memstream = new MemoryStream(schemaData);
        var schema = XmlSchema.Read(memstream, null);
        Debug.Assert(schema != null);
        Debug.Assert(schema.TargetNamespace == targetns);
        xmlschemas.Add(schema);
    }
}

var xmlschemas = new XmlSchemaSet();

using var client = new HttpClient();

await AddRemoteSchema(
    client,
    xmlschemas,
    "https://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd",
    "http://www.w3.org/2005/01/pronunciation-lexicon",
    "5c15b4a8862e5232512bc26ca6359f0c679f0059515dc66af8ae8be125a74061"
);

if (args.Contains("--import-xml-schema"))
{
    await AddRemoteSchema(
        client,
        xmlschemas,
        "https://www.w3.org/2009/01/xml.xsd",
        "http://www.w3.org/XML/1998/namespace",
        "cc701736c42cc64126fad063bb95f94484b5de3b5f808a86ea098b0957aff829"
    );
}

var xml = XDocument.Parse(pls);

xml.Validate(xmlschemas, (o, e) =>
{
    if (e.Severity is XmlSeverityType.Error or XmlSeverityType.Warning)
    {
        throw new XmlSchemaValidationException(e.Message, e.Exception);
    }
});

Console.WriteLine("OK");

klightspeed@Capella MINGW64 ~/source/repos/ConsoleApp1/app
$ dotnet run Program.cs
Unhandled exception. System.Xml.Schema.XmlSchemaValidationException: The 'http://www.w3.org/XML/1998/namespace:base' attribute is not declared.
   at System.Xml.Schema.XmlSchemaValidator.SendValidationEvent(XmlSchemaValidationException e, XmlSeverityType severity)
   at System.Xml.Schema.XmlSchemaValidator.RecompileSchemaSet()
   at System.Xml.Schema.XmlSchemaValidator.Init()
   at System.Xml.Schema.XNodeValidator.Validate(XObject source, XmlSchemaObject partialValidationType, Boolean addSchemaInfo)
   at Program.<Main>$(String[] args)
   at Program.<Main>(String[] args)

klightspeed@Capella MINGW64 ~/source/repos/ConsoleApp1/app
$ dotnet run Program.cs --import-xml-schema
OK

xml.xsd should have blob hash bd291f3d4be818edcb1498697ffd03f0226a9cf8:

$ IFS='' read -r -d '' xml_xsd < <(curl -s https://www.w3.org/2009/01/xml.xsd); printf "%s" "$xml_xsd" | sha256sum; printf "blob %d\0%s" "$(printf "%s" "$xml_xsd" | wc -c)" "$xml_xsd" | sha1sum
cc701736c42cc64126fad063bb95f94484b5de3b5f808a86ea098b0957aff829  -
bd291f3d4be818edcb1498697ffd03f0226a9cf8  -
$ curl -s -L -H "Accept: application/vnd.github.object" -H "X-GitHub-Api-Version: 2026-03-10" https://api.github.com/repos/klightspeed/EDDI/contents/SpeechService/Properties/xml.xsd?ref=add-xml-xsd | jq '.sha'
"bd291f3d4be818edcb1498697ffd03f0226a9cf8"
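
The blob hash check above relies on how git identifies file contents: a SHA-1 digest over the header `blob <byte length>\0` followed by the raw bytes. As a rough sketch, the same computation could be written in C# as follows; the `GitBlob` class name is an illustrative assumption and not part of EDDI.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class GitBlob
{
    // Computes the object ID git assigns to a file's contents:
    // SHA-1 over "blob <byte length>\0" followed by the raw bytes.
    public static string Hash(byte[] content)
    {
        byte[] header = Encoding.ASCII.GetBytes($"blob {content.Length}\0");
        byte[] buffer = new byte[header.Length + content.Length];
        header.CopyTo(buffer, 0);
        content.CopyTo(buffer, header.Length);
        return Convert.ToHexString(SHA1.HashData(buffer)).ToLowerInvariant();
    }
}
```

Calling `GitBlob.Hash(File.ReadAllBytes("xml.xsd"))` on a byte-identical local copy should reproduce the hash that GitHub reports for the committed file; any difference in line endings or encoding would change the result.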

@klightspeed changed the title from "Add xml.xsd to lexicon schema set" to "[#2814] Add xml.xsd to lexicon schema set" on May 15, 2026
@klightspeed changed the title from "[#2814] Add xml.xsd to lexicon schema set" to "Add xml.xsd to lexicon schema set #2814" on May 15, 2026
@klightspeed changed the title from "Add xml.xsd to lexicon schema set #2814" to "Add xml.xsd to lexicon schema set" on May 15, 2026
@klightspeed marked this pull request as ready for review on May 15, 2026 15:38

@klightspeed (Contributor, Author) commented

The original pull commit had a typo - Eddn instead of Eddi. This has been fixed.

Testing:

  • Add an en.pls lexicon mapping Adder to ˈɔɹɪnd͡ʒ
  • Set the startup project to Eddi
  • Place a breakpoint on xml.Validate(...
  • Comment out FetchSchemasFromResource( "EddiSpeechService.Properties.xml.xsd" );
  • Debug the project
  • On the Text-to-Speech tab, select Microsoft David, ship Adder, and click Test voice
  • Verify that I get the validation error
  • Speech continues with "This is how I will sound in your Adder"
  • Exit EDDI
  • Uncomment FetchSchemasFromResource( "EddiSpeechService.Properties.xml.xsd" );
  • Debug the project
  • On the Text-to-Speech tab, select Microsoft David, ship Adder, and click Test voice
  • Verify that validation succeeds
  • Verify that I hear "This is how I will sound in your Orange"

@Tkael merged commit 5726f6e into EDCD:develop on May 16, 2026
3 checks passed