Regex validation, was Re: Programmers with network engineering skills
On Mon, Mar 12, 2012 at 9:18 PM, Mark Andrews <marka@isc.org> wrote:
> Only if you don't properly quote/escape the arguments you are passing.
You're using your OS wrong if you are quoting/escaping the arguments. fork() + exec() + wait() never involve a shell, so there is nothing to quote (assuming Unix; I also suspect libc has a nice packaged function for this that is not insecure like system(), but it's not all that hard to roll your own). In Perl, use the multi-argument form of system(), not the single-argument version. In both cases you should clear the environment as well prior to the exec()/system() unless you know nobody can play with LD_PRELOAD, IFS, etc.

This is one of my pet peeves about programming: programmers calling out to insecure functions when secure alternatives are available.

The same goes for SQL statements. If you need to quote things to prevent SQL injection, you're using your SQL database wrong. Look up prepared statements. Generally, it's very bad practice to dynamically build SQL strings. It's also very common practice, which is why so many applications have SQL injection vulnerabilities. It's the Perl/PHP equivalent of the buffer overflow, and it simply wouldn't exist if developers, instead of trying to figure out how to quote everything, used prepared statements and placeholders.

As for checking for bogus email addresses, read the RFC and code it right. That's not with a too-simple regex, nor is it with a complex regex. You need a parser, which is the right tool for the job. Regex is not. But there is value in not passing utter garbage to another program (it has a tendency to clog mail queues, if for no other reason) - just make sure you do it right.

I might add that the same goes for names. People don't just have a first name and a last name. Some people have just one name, some people have three or four names, some people have surnames with spaces, hyphens, or apostrophes (remember what I said about SQL?!), etc. Yet most systems I see assume people have two names with no spaces, apostrophes, hyphens, etc. Big mistake.
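To make the prepared-statement point concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table name `people`, the column `full_name`, and the helper functions are hypothetical and chosen only for illustration; the point is that the value travels separately from the SQL text, so a name containing an apostrophe (or worse) needs no quoting at all.

```python
import sqlite3

def add_person(conn, full_name):
    # The "?" placeholder binds the value separately from the SQL text,
    # so the name is never spliced into the statement and needs no escaping.
    conn.execute("INSERT INTO people (full_name) VALUES (?)", (full_name,))

def find_person(conn, full_name):
    cur = conn.execute(
        "SELECT full_name FROM people WHERE full_name = ?", (full_name,)
    )
    return [row[0] for row in cur.fetchall()]

# Hypothetical schema: one free-form name column, no first/last split.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (full_name TEXT)")

names = [
    "Madonna",                          # a single name
    "Mary-Jane O'Brien van der Berg",   # hyphen, apostrophe, spaces
    "x'); DROP TABLE people; --",       # classic injection attempt, stored as data
]
for name in names:
    add_person(conn, name)

row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

Other database drivers use different placeholder syntax (for example `%s` or named parameters), but the idea is the same: the statement text stays constant and the data is bound to it.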
And don't get me started on addresses, which might have one address line, two address lines, even 5 address lines, to say nothing of the fact that international addresses may or may not put the "street" part first. It's certainly not easily regex-able.

Okay, I'll step off the soap box and let the next person holler about how I was wrong about all this!
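As an illustration of the fork() + exec() + wait() point made earlier in this message, here is a minimal sketch in Python, whose subprocess module runs a program directly when given an argument list. The helper name `run_without_shell`, the fallback PATH value, and the use of a child Python process as a stand-in for the external program are all assumptions made for the example; it also assumes a Unix-like system.

```python
import subprocess
import sys

def run_without_shell(argv, env=None):
    """Run argv directly, with no shell in between, in a minimal environment."""
    # Start from a near-empty environment so inherited variables such as
    # LD_PRELOAD or IFS cannot influence the child. The PATH is an assumption.
    clean_env = {"PATH": "/usr/bin:/bin"} if env is None else env
    result = subprocess.run(
        argv,                 # a list: each element reaches the child as one argument
        env=clean_env,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode, result.stdout

# Input that would be dangerous if it were pasted into a shell command line.
hostile = "$(echo pwned); `id` 'quoted' *"

# The child simply prints its first argument back, standing in for any
# external program that receives user-supplied data.
code, out = run_without_shell(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
)
```

Because no shell ever parses the string, the metacharacters arrive in the child as plain data and no quoting or escaping step is needed.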
Joel Maslak wrote:
> is not. But there is value in not passing utter garbage to another program (it has a tendency to clog mail queues, if for no other reason) - just make sure you do it right.
I fail to see why you wouldn't be able to throttle any abuse of your web form so it wouldn't clog a mail queue. Besides, it's very hard to clog or otherwise overload an MTA, since it's purpose-built to handle that kind of thing.

I also fail to see why it would be so hard to install an MTA listening on localhost whose sole purpose would be to validate email addresses and nothing else, and which just dumps any possible outgoing email to /dev/null. If you're afraid of clogging the mail queue, then only hand the message off to the sending MTA after validation has succeeded.

But to be honest, why would you care? MTAs are purpose-made to handle such things. I can't really think of a scenario where validating an email address using a separate service would create such a performance bottleneck. If you have robots flooding your web forms 1000s of times a second (still peanuts for the average MTA) you need to rethink your security and abuse prevention...not your email validation...I would say. :-)

People use a separate database instance for database queries, and the database server has its own code to validate input. We don't code our own database server as part of the web form handling code. Why not hand off email validation the same way?
> Okay, I'll step off the soap box and let the next person holler about how I was wrong about all this!
You're mostly right, but I disagree about the email validation part. I just don't see a point in re-inventing the wheel when there are perfectly capable free alternatives that can do it for you with no noticeable performance penalty.

Greetings,
Jeroen

--
Earthquake Magnitude: 4.8
Date: Saturday, March 17, 2012 01:49:29 UTC
Location: Banda Sea
Latitude: -7.0313; Longitude: 123.4175
Depth: 632.60 km
participants (2): Jeroen van Aart, Joel Maslak