[OT] XML spec notation question (Real Studio network user group Mailinglist archive)


[OT] XML spec notation question
Date: 13.12.02 21:07 (Fri, 13 Dec 2002 14:07:55 -0600)
From: Thomas Reed
I'm trying to wade through the XML specs, and I've got a question.
In the following notation:

([a-zA-Z0-9_.:] | '-')+

why is the dash character separate? Isn't the following notation equivalent:

([a-zA-Z0-9_.:-])+

Is there some subtle difference I'm missing? Or is the '-' character
for some reason not allowed inside the square brackets? I'm not
aware of that having any special meaning in regular expressions, but
I'm far from an expert there.

Thanks in advance!

Re: [OT] XML spec notation question
Date: 13.12.02 21:16 (Fri, 13 Dec 2002 12:16:43 -0800)
From: Joseph J. Strout
At 2:07 PM -0600 12/13/02, Thomas Reed wrote:
>I'm trying to wade through the XML specs, and I've got a question.
>In the following notation:
>
> ([a-zA-Z0-9_.:] | '-')+
>
>why is the dash character separate? Isn't the following notation equivalent:
>
> ([a-zA-Z0-9_.:-])+
>
>Is there some subtle difference I'm missing? Or is the '-'
>character for some reason not allowed inside the square brackets?
>I'm not aware of that having any special meaning in regular
>expressions, but I'm far from an expert there.

It does have special meaning, both in regular expressions, and in
this spec, as the example above demonstrates. It's used three times
before the one you're paying attention to. ;)

Cheers,
- Joe

Re: [OT] XML spec notation question
Date: 13.12.02 21:27 (Fri, 13 Dec 2002 14:27:56 -0600)
From: Thomas Reed
>It does have special meaning, both in regular expressions, and in
>this spec, as the example above demonstrates. It's used three times
>before the one you're paying attention to. ;)

Okay, I should have been more specific about the placement. It
doesn't have special meaning *at the end* that I'm aware of.

In fact, REALbasic's own RegEx documentation says that the string
"[a-c-]" will find a character in the range a-c or a '-' character.
So wouldn't the two forms in my original message be functionally
equivalent?
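
As a rough check of the equivalence being asked about, here is a minimal sketch using Python's `re` module rather than REALbasic's RegEx class (an assumption: Python documents a trailing `-` in a set as a literal dash, which is the same behavior described above). The names `separate`, `combined`, and `same_verdict` are made up for illustration.

```python
import re

# The two forms from the original question, written as Python regexes.
# (?:...) is a non-capturing group standing in for the plain parentheses.
separate = re.compile(r"(?:[a-zA-Z0-9_.:]|-)+")
combined = re.compile(r"[a-zA-Z0-9_.:-]+")

def same_verdict(text):
    """True when both patterns agree on whether text matches in full."""
    return bool(separate.fullmatch(text)) == bool(combined.fullmatch(text))

# Compare the two patterns on every single ASCII character.
disagreements = [chr(c) for c in range(128) if not same_verdict(chr(c))]
print(disagreements)

# A string with dashes in the middle and at the end.
print(bool(combined.fullmatch("a-c-")))
```

This only says the two forms behave alike as regular expressions in one engine. It says nothing about how the spec's own notation reads a dash inside brackets.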

Re: [OT] XML spec notation question
Date: 13.12.02 22:20 (Fri, 13 Dec 2002 13:20:32 -0800)
From: Brady Duga

On Friday, December 13, 2002, at 12:27 PM, Thomas Reed wrote:

>> It does have special meaning, both in regular expressions, and in
>> this spec, as the example above demonstrates. It's used three times
>> before the one you're paying attention to. ;)
>
> Okay, I should have been more specific about the placement. It
> doesn't have special meaning *at the end* that I'm aware of.

It is true that some regular expression implementations treat a "-"
at the end of a set as a literal character. RB does, as does Perl
(no great surprise there). However, the grammar rules in the XML spec
are *not* the same as regular expressions. They are expressions on the
right-hand side of a rule in EBNF (Extended Backus-Naur Form). I don't
think EBNF allows this convenience, so the dash is spelled out
explicitly. To give it credit, BNF was designed to specify the ALGOL 60
programming language. And yes, the "60" is the year it was released
(as an update to ALGOL 58).
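
To make the reading of the rule concrete, here is a small sketch that translates ([a-zA-Z0-9_.:] | '-')+ by hand into Python, one alternative at a time, without any regular expression engine. The names `SET_CHARS` and `matches_rule` are hypothetical, and this is only one plausible reading of the notation, not code from the spec.

```python
# Characters covered by the bracketed set [a-zA-Z0-9_.:] in the rule.
SET_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789_.:"
)

def matches_rule(text):
    """Check text against ([a-zA-Z0-9_.:] | '-')+ alternative by alternative."""
    if not text:
        return False  # '+' requires at least one repetition
    for ch in text:
        in_set = ch in SET_CHARS  # first alternative: the bracketed set
        is_dash = ch == "-"       # second alternative: the quoted literal '-'
        if not (in_set or is_dash):
            return False
    return True

print(matches_rule("xml-stylesheet"))
print(matches_rule("a b"))
```

Written this way, the dash never has to be interpreted inside the brackets at all, which is presumably the point of quoting it separately.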

--Brady
The La Jolla Underground

Re: [OT] XML spec notation question
Date: 13.12.02 23:27 (Fri, 13 Dec 2002 16:27:17 -0600)
From: Thomas Reed
>It is true that some regular expression implementations treat a "-"
>at the end of a set as a literal character. RB does, as does Perl
>(no great surprise there). However, the grammar rules in the XML
>spec are *not* the same as regular expressions.

Okay, cool, so long as I'm understanding it properly.

Thanks for your help, guys!

Re: [OT] XML spec notation question
Date: 13.12.02 21:33 (Fri, 13 Dec 2002 12:33:49 -0800)
From: Joseph J. Strout
At 2:27 PM -0600 12/13/02, Thomas Reed wrote:

>Okay, I should have been more specific about the placement. It
>doesn't have special meaning *at the end* that I'm aware of.
>
>In fact, REALbasic's own RegEx documentation says that the string
>"[a-c-]" will find a character in the range a-c or a '-' character.
>So wouldn't the two forms in my original message be functionally
>equivalent?

Oh, well that's news to me. Probably the authors of the spec either
use a version of RegEx that doesn't support this, or just wanted to
be unambiguous about it.

Cheers,
- Joe