Source: Critters Writers' Workshop Blog

Really REALLY don't trust software code written by "chatbots" like Google's Bard AI. Eeek!

[Part II of Really really don't trust "chatbots" like Google's Bard AI. They lie. A lot.]

So, let's see how Bard does on writing software. I've received emails from Google urging me to try Bard to write software code. Literally, as in, "Try coding with Bard" and "Bard can now help you code" and "Collaborate with Bard to help generate, explain, and debug your code." Hey, that sounds pretty useful! Ok, I thought, if it lies about me and facts in general, how good is it at writing code? Ai-yi-yi.

Summary: No no no no no no no no no. Danger Will Robinson! Do not use! Bard's code could kill people.

tl;dr for coding geeks:

Experiment #1: I first asked it to write a function in PERL to validate that an email address was syntactically correct. You know, "jsmith@somewhere.com" is valid, likewise "j.smith@some-where.co.uk"; whereas invalid email addresses would be things like "jsmith@", "jsmith@xyz.", "j^smith@some!where!com", "jsmith@some@where.com", "^.#@%&", "j;;smith@@@" and stuff like that.

The exact rules for a valid email address are a bit involved, as you can read here, but I figured if it got the general usage rules I'd be satisfied. For example, in front of the one and only '@' you usually see letters, numbers, and the most common punctuation, say _ . - +. After the '@' is stricter, and I'd be happy if it said it had to be letters, numbers, _ . - plus the rule that it has to have at least one '.' in the middle of some letters, and not at the start. That rules in and out most of the cases you see. In PERL, a regular expression (aka "wildcard pattern") for that would be something like:

    ^\w[\w.+-]*@[\w-]+(\.\w+)+$

(Mini regex tutorial: \w is shorthand for any letter, number, or '_'. Stuff in [] means any one of those characters. A '+' means one or more of the previous thing, a '*' means zero or more of the previous thing. \. means a literal '.' (unless inside [], where you can do just a '.'). ^ and $ at start and end mean "at start/end", not in the middle. Complicated, I know.)

So that regex means: starts with a letter/number/underscore, followed by any number of those or the punctuation . + -, followed by your regular old '@', followed by one or more letters/numbers/underscores/dashes, then at least one thing that starts with a '.' followed by one or more letters/numbers. (More or less.) So, jsmith@somewhere.com is in. j;;smith@@.wrong.j!!nk is out, and so on. It's not perfect, but it covers a lot of ground.

Well, Bard did come up with a pattern that was not horrible:

    ^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z]+$

which says, in front of the '@' you have one or more letters/numbers/punctuation _ . + - (which is reasonable-ish) and after the '@' you have at least one letter/number/dash followed by a '.' followed by some letters. That covers a lot of cases, though my biggest quibble here would be that it doesn't allow more than one '.' after the '@', so jsmith@mail.somewhere.com would fail when it shouldn't. But, that piece is almost ok-ish.

The problem is what it did with that pattern. Here's Bard's code, with line numbers added to discuss it:

    1 sub is_email_valid {
    2     my $email = shift;
    3     # Check if the email address is empty.
    4     return 0 if $email eq "";
    5     # Check if the email address has the correct format.
    6     /^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z]+$/ =~ /$email/;
    7     return 1;
    8 }

and suggests this code to show how it works:

     9 if (is_email_valid("johndoe@example.com")) {
    10     print "The email address is valid.";
    11 } else {
    12     print "The email address is invalid.";
    13 }

Lines 1-4 are fine. It sets up a function, gets the email address, and says "nope" if it's empty. Sure. An empty address isn't valid.

But in line 6 we run into problems. It has that regular expression pattern I mentioned above as being reasonable -- but it uses it wrong. It's used backwards. It won't work. It needs to be $email =~ /pattern/ NOT /pattern/ =~ /$email/ ! (You know, like 10/2 is not the same as 2/10.)
The way Bard coded it, the email address is treated as the pattern, and the thing tested against it isn't even the address: PERL evaluates the left-hand /.../ as its own match (against the default variable $_) and hands that true/false result to the right-hand side. It won't do what you want. So, line 6 just won't match anything about an email address like you'd want.

But that doesn't matter, in the end, since the last line, line 7, ignores whatever the test was in line 6 and simply returns "ok! that email is valid!". (The value "1" means "true.") So this function that looks... interesting... like it has some ok-ish elements to it... just says anything is a valid email address, so long as it isn't entirely empty. Oy. Now, if this function to test email addresses has anything important behind the result, like, to avoid security breaches, Houston, you've got a problem.

Then there's the purported test code, on lines 9-13. Alas, even if the function worked correctly, there's a subtle mistake in this here test code too. It calls is_email_valid("johndoe@example.com"), which looks fine... except that in PERL, a thing like @example inside double quotes is an array variable name, and is replaced by the value of the array. In this specific case, there is no array named "example", so @example is empty, so this turns into a function call is_email_valid("johndoe.com"). (That's without "use strict"; with it, PERL refuses to compile the line at all.) The wrongly written function says "johndoe.com" is valid, of course, so running the test code looks like the function worked! But ha! Only because two errors compounded each other.

To fix that tiny problem, it should put a \ before the @ in this instance, or use single quotes: is_email_valid("johndoe\@example.com") or is_email_valid('johndoe@example.com'). Nitpicky, yeah, but that's programming. (There's the apocryphal story about a space probe lost because of a wrong comma.)

Again, the main takeaway here is that (1) Bard can't code well and (2) it can look like it's right when it isn't. That makes it very hard for a human "collaborator" to use. If it looks okay at a glance, but it's plain wrong? Ouch.
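Here's a rough sketch of the two fixes that matter in that function: test the address against the pattern (not the other way round), and return the result of that test instead of a blanket "valid". It's written in C with the POSIX regex functions (regcomp/regexec) rather than PERL, which is purely my choice for illustration. The helper name matches_pattern and the two pattern constants are made up for this sketch, and \w is spelled out as [A-Za-z0-9_] because POSIX doesn't guarantee that shorthand. Treat it as a sketch of the idea under those assumptions, not a full email validator.

```c
#include <regex.h>
#include <stddef.h>

/* The hand-written pattern from Experiment #1, with \w spelled out. */
const char *const AUTHOR_PATTERN =
    "^[A-Za-z0-9_][A-Za-z0-9_.+-]*@[A-Za-z0-9_-]+(\\.[A-Za-z0-9_]+)+$";

/* Bard's pattern, unchanged apart from C string escaping. */
const char *const BARD_PATTERN =
    "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z]+$";

/* Hypothetical helper: returns 1 if text matches pattern, 0 otherwise
   (also 0 if the pattern itself fails to compile). */
static int matches_pattern(const char *pattern, const char *text) {
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return 0;
    }
    int found = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return found;
}

/* The address is the thing being tested, and the return value is the
   outcome of that test, not an unconditional 1. */
int is_email_valid(const char *email) {
    if (email == NULL || email[0] == '\0') {
        return 0;
    }
    return matches_pattern(AUTHOR_PATTERN, email);
}
```

With this, "jsmith@somewhere.com" and "j.smith@some-where.co.uk" come back as 1, while "jsmith@", "jsmith@xyz.", "jsmith@some@where.com" and the accidentally mangled "johndoe.com" come back as 0. Feeding Bard's pattern to the same helper, matches_pattern(BARD_PATTERN, "jsmith@mail.somewhere.com") returns 0, which is the more-than-one-dot quibble.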
If some junior programmer, or tired expert programmer, just copied Bard's code into some other software they're writing, even after glancing at it, it could lead to some very bad results. Even potentially deadly, as you'll see in Experiment 2. But first, let's try a different language for the email thing:

Experiment #1b: I asked it to write that email validation function in the C language. It did even worse. It said...

Sure, here is a C function to check if an email address is valid:

     1 #include
     2 #include
     3 int is_email_valid(char *email) {
     4     // Check if the email address is empty.
     5     if (email == NULL || strlen(email) == 0) {
     6         return 0;
     7     }
     8     // Check if the email address has the correct format.
     9     char *at_pos = strchr(email, '@');
    10     if (at_pos == NULL) {
    11         return 0;
    12     }
    13     char *dot_pos = strchr(at_pos + 1, '.');
    14     if (dot_pos == NULL) {
    15         return 0;
    16     }
    17     // Check if the domain name is valid.
    18     for (char *cur = dot_pos + 1; *cur != '\0'; cur++) {
    19         if (!isalpha(*cur) && !isdigit(*cur)) {
    20             return 0;
    21         }
    22     }
    23     // The email address is valid.
    24     return 1;
    25 }
    26 int main() {
    27     // Get the email address from the user.
    28     char email[100];
    29     printf("Enter an email address: ");
    30     fgets(email, 100, stdin);
    31     // Check if the email address is valid.
    32     if (is_email_valid(email)) {
    33         printf("The email address is valid.\n");
    34     } else {
    35         printf("The email address is invalid.\n");
    36     }
    37     return 0;
    38 }

Let's break down what it wrote.

Lines 4-7 say it's an invalid email address (returns "0") if it's empty. Fine.

Lines 8-12 say it's invalid if it doesn't have an '@' anywhere in it. Okay.

Lines 13-16 say it's invalid if there is no '.' anywhere after the '@'. True, but there's a lot more to check than that. Kinda petty and wasteful to check just for that.

Lines 17-22 say it's invalid if there is any character after that '.' that isn't a letter or number, and otherwise it's valid! Yay! Per line 24. Hrunh?!?
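To poke at those checks directly, here's Bard's function again as a sketch you can call with fixed strings. I've renamed it bard_is_email_valid (my name, to keep it apart from anything that actually works), folded isalpha/isdigit into the equivalent isalnum with the usual unsigned char cast, and left the logic alone. One assumption to flag: the includes below are simply the headers these calls need, not necessarily the ones Bard wrote.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Bard's checks from the listing above, logic unchanged.
   The name is hypothetical; the original is is_email_valid. */
int bard_is_email_valid(const char *email) {
    /* Empty or missing input is invalid. */
    if (email == NULL || strlen(email) == 0) {
        return 0;
    }
    /* Must contain an '@' somewhere. */
    const char *at_pos = strchr(email, '@');
    if (at_pos == NULL) {
        return 0;
    }
    /* Must contain a '.' somewhere after the first '@'. */
    const char *dot_pos = strchr(at_pos + 1, '.');
    if (dot_pos == NULL) {
        return 0;
    }
    /* Everything after that first '.' must be a letter or digit. */
    for (const char *cur = dot_pos + 1; *cur != '\0'; cur++) {
        if (!isalnum((unsigned char)*cur)) {
            return 0;
        }
    }
    /* Nothing else is checked, so everything else passes. */
    return 1;
}
```

There's one more wrinkle, as far as I can tell: fgets in Bard's main keeps the trailing newline, so an address typed at the prompt arrives as "user@example.com\n" and is rejected when the loop hits the '\n'. The function and the program wrapped around it don't even agree with each other.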
So, Bard's function says user@.com is valid (nope) but it would reject "user@somewhere.co.uk" (after the first '.' is another '.', which isn't a letter/number). It would accept "!^#$\;@." or "user@&*!!.com", and so on and on. It's a horribly insufficient set of tests for a valid email address. Not even in the ballpark. Jeez, it could at least have called a regular expression function like the PERL example did. This is just awful.

This is like trying to see if someone is allowed to withdraw money from your bank account by asking them, "Do you have a gun in your left pocket? No? Ok, is your name Jesse James? No? Well okay then, you're clear to empty out bank account #1234. Have fun!" Mindboggling.

And, sure, there could be cases where a valid email address is important and really needs to be checked. But this code is non-obvious. You have to really know your C and really know your email address rules and really work your way through this function carefully to see how useless it is. On the surface it sounds so soothing. "Check if the email address has the correct format", it says, and "Check if the domain name is valid" and "The email address is valid." But it's complete BS. As I said, Bard lies. Don't trust it.

Maybe it was just that it didn't handle email address rules well...

Experiment #2: So, I thought I'd see if it could handle something a little simpler, but also of potentially life-threatening consequences if it got it wrong. I asked it:

    write a perl function to characterize blood pressure as normal, elevated hypertension stage 1, hypertension stage 2, or hypertensive crisis

It came back with "Sure":

    1 sub characterize_blood_pressure {
