Ticket #27 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

Can't send e-mails to addresses with ' or %

Reported by: kmeirlaen Owned by: somebody
Priority: important Milestone: 1.0 RC2
Component: Other Version:
Keywords: Cc:

Description

I’m sure there are a lot of other valid e-mail addresses that just won’t work with QEmailServer, but these ones are actually semi-common, so we should definately support them.

I’ve got it to work by changing line 105 of QEmailServer.class.php to:
preg_match_all ("/[a-zA-Z0-9_.'%+-]+[@][\-a-zA-Z0-9_.]+/", $strAddresses, $strAddressArray);

Attachments

test_mail.php Download (4.4 KB) - added by marcosdsanchez 3 years ago.
test.output.2.txt Download (14.0 KB) - added by marcosdsanchez 3 years ago.
arpad3 passes all the tests
QEmailServer.class.php.patch Download (1.0 KB) - added by marcosdsanchez 3 years ago.
attaching new patch with arpad3 approach
27-Quick.patch Download (0.9 KB) - added by VexedPanda 3 years ago.
Patch suitable for 1.0.0, just expands original QCodo RegEx?. No strict checking.

Change History

  Changed 4 years ago by marcosdsanchez

  • status changed from new to in_QA

follow-up: ↓ 3   Changed 4 years ago by alex94040

Patch reviewed, feel free to check in.

Also confirmed that ' and % are indeed valid characters in email addresses ( http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx)

in reply to: ↑ 2   Changed 4 years ago by vakopian

Replying to alex94040:

Patch reviewed, feel free to check in.

Also confirmed that ' and % are indeed valid characters in email addresses ( http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx)

Guys, I don't think the original or the new expression is correct. This should not be done ad-hoc, just because the expression seams reasonable. There is a standard (RFC 2822) for valid email addresses. Even if we don't want to use the regex corresponding to the standards, there are better alternatives. See  http://www.regular-expressions.info/email.html

-Vartan

  Changed 4 years ago by VexedPanda

As said on that page, there's a trade-off between performance and correctness.
Regex really isn't designed for this (or for maintainability) either, so if we go with correctness, I'd rather see this used:
 http://code.google.com/p/php-email-address-validation/

What we have now though is pretty fast while being reasonably complete, so I have no qualms using it as-is.

  Changed 4 years ago by VexedPanda

Ok, that code's not good. I had a better link in IRC way back. I'll try and track it down again.

in reply to: ↑ 6   Changed 4 years ago by vakopian

Replying to VexedPanda:

Found it:
 http://www.pgregg.com/projects/php/code/validate_email.inc.phps
Comparison chart at:
 http://www.pgregg.com/projects/php/code/showvalidemail.php

How about adding an option to QEmailServer, so the user can decided which trade-off she wants to go with. We can have 2 or 3 simple pre-configured levels (fast, medium, strong). And as a catch all, we can make the reqexp settable, so if our canned choices are still not good, the user can put her own.

  Changed 4 years ago by VexedPanda

Considering the only complaints we've ever gotten are a couple e-mail addresses with 's, that seems like overkill.
I'd be ok adding a "strong" option using that PHP code though.

  Changed 4 years ago by alex94040

I agree with Vexed, the fancy approach seems like an overkill. If you folks feel that it's paramount that we go with the standard, go ahead and move the bug off of RC2.

  Changed 4 years ago by VexedPanda

Personally I think this patch should be applied to RC2, and we can create a new ticket to add "strong" support, if desired.

  Changed 3 years ago by basilieus

We should standardize everything if possible. ->> move to RC2

  Changed 3 years ago by basilieus

Sorry it is in RC2, sheesh man it's late here.. :)

  Changed 3 years ago by marcosdsanchez

The test is written to be executed in the command line.

  Changed 3 years ago by alex94040

Marcos - this is outright *amazing* work, thanks a bunch for doing this.

  Changed 3 years ago by VexedPanda

I'm guessing this is mostly the same results as seen at  http://www.pgregg.com/projects/php/code/showvalidemail.php

So I'm wondering why "vexed's php approach" doesn't match what's shown in that grid?

Changed 3 years ago by marcosdsanchez

Changed 3 years ago by marcosdsanchez

arpad3 passes all the tests

  Changed 3 years ago by marcosdsanchez

I think we should use arpad3 regexp as it passes all the tests and it's just a regexp, not a complicated filter with lots of lines of code.

Just my 2 cents....

  Changed 3 years ago by alex94040

Marcos - I agree, let's use arpad3 regular expression. Feel free to check in when you have a moment. Thanks again for your hard work on this!

  Changed 3 years ago by VexedPanda

I find it odd that a bunch of PHP developers prefer the RegEx? version over the PHP version.

That said, performance on the regex is about 10x faster than on the PHP version, so it's got my vote.

This full validation RegEx? is also only about 20% slower than the original patch, so I don't see a point in making it optional either.

  Changed 3 years ago by alex94040

Vexed - marcos and I are having trouble understanding what you meant by "This full validation RegEx?? is also only about 20% slower than the original patch, so I don't see a point in making it optional either."

Could you please elaborate?

  Changed 3 years ago by VexedPanda

Sorry, arpad3 is 20% slower than the original "Vexed's solution". This additional overhead seems acceptable to me to gain that extra accuracy.

  Changed 3 years ago by alex94040

Awesome, we're all on the same page, then. Marcos, please check in when you have a moment.

Changed 3 years ago by marcosdsanchez

attaching new patch with arpad3 approach

  Changed 3 years ago by marcosdsanchez

I added the patch with the arpad3 regexp, please review the patch before I commit it.

  Changed 3 years ago by alex94040

OK to check in. Thanks!!

  Changed 3 years ago by marcosdsanchez

  • status changed from in_QA to closed
  • resolution set to fixed

(In [44]) fixes #27

  Changed 3 years ago by vakopian

Folks,
The patch that was given here and applied to the code has 3 problems:

1) There is a typo(?) in the reg expression, it's not the correct arpad3 version (it's missing two back-slashes around the beginning);

2) It's split into multiple lines, which introduces spaces into the expression, which are significant in a regular expression. A reg expression should be given on one line (alternatively, if pretty formatting is really needed, we could use concatenation);

3) I think, this reg expression breaks the original logic of the QEmailServer. If I'm not mistaken, the original code was intended to take a string with many email addresses, extract valid addresses from it, and return it as an array. The new code (even when the 2 points above are fixed) works fine if only one email address is given, but fails when multiple addresses are given inside one string. You can test this by feeding a string with 2 simple addresses to the new routine.
I'm actually not yet sure how this point can be fixed. So I think this ticket should be re-opened (I don't have permission to reopen it myself).

-Vartan

  Changed 3 years ago by alex94040

  • status changed from closed to new
  • resolution fixed deleted

Reopening per Vakopian's request above.

  Changed 3 years ago by VexedPanda

Good catches. Thanks Vakopian!

3) I think this may simply be due to the at the beginning of the regex. Originally arpad3 wasn't meant to be used in preg_match_all.

  Changed 3 years ago by marcosdsanchez

Vakopian is right, the patch is not OK.

I can think of two approaches for this bug:

1) Use just one regex for emails concatenated with a ; (fast but we need another regexp that support concatenated emails).

2) separate the array (explode it) and validate each email (slow but arpad3 or any other email validation regex will work).

I would go for the second solution as it's easier.

I'm going on holidays tomorrow so I won't have time for this bug. Regex test is attached, please update it and execute it.

Thoughts?

  Changed 3 years ago by VexedPanda

Just realized my comment was mangled.
I think the multi-match is broken only because we're checking for "beginning of string" with the carrat symbol, which the tikceting system doesn't like. :P

  Changed 3 years ago by VexedPanda

Ok, I've played around with the regex, and still wasn't able to get it to do what I expected. I think we'd need a regex expert to make it suitable for use with preg_match_all.

My vote is we check in the original % and ' change, and leave strict e-mail checking until v1.1.0

Changed 3 years ago by VexedPanda

Patch suitable for 1.0.0, just expands original QCodo RegEx?. No strict checking.

  Changed 3 years ago by VexedPanda

  • status changed from new to in_QA

  Changed 3 years ago by alex94040

Vexed - agreed, let's do that. Please check in and open a new 1.1 bug, referencing this one, that would track making the regex mega-powerful.

  Changed 3 years ago by VexedPanda

  • status changed from in_QA to closed
  • resolution set to fixed

(In [66]) Fixes #27
Updates e-mail regex

Note: See TracTickets for help on using tickets.