Write a regular expression that extracts the src URLs of all img tags in HTML source code

function getImgSrcs(htmlString) {
  const regex = /<img[^>]+src="([^">]+)"/g;
  let match;
  const srcs = [];

  while ((match = regex.exec(htmlString)) !== null) {
    srcs.push(match[1]);
  }

  return srcs;
}


// Example usage:
const html = `
<div>
  <img src="image1.jpg" alt="Image 1">
  <img src='image2.png' alt="Image 2">
  <img src = "image3.gif" alt="Image 3">
  <img src= "image4.webp" alt="Image 4">
  <img data-src="data-image.svg">
  <p>Some text here</p>
  <img src="image5.jpg" alt="Image 5">
</div>`;

const imageSrcs = getImgSrcs(html);
console.log(imageSrcs); // Output: ['image1.jpg', 'data-image.svg', 'image5.jpg']



Explanation and Improvements:

  • /<img[^>]+src="([^">]+)"/g: This is the core regular expression. Let's break it down:

    • <img: Matches the literal string <img.
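    • [^>]+: Matches one or more characters that are not a closing angle bracket (>). This skips over any attributes that come before src while keeping the match inside a single img tag. Because it accepts any character, it also lets the pattern match attribute names that merely end in src, such as data-src.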
    • src=": Matches the literal string src=".
    • ([^">]+): This is the capturing group. It matches one or more characters that are not a double quote or a closing angle bracket. This is the part that extracts the actual src value.
    • /g: The g flag makes the regex global, so it finds all matches in the string, not just the first one.
  • match[1]: The match object returned by regex.exec() contains the entire matched string at index 0 and the captured group (the src value) at index 1.

  • Limitations with variations in the src attribute: Because the pattern requires the literal text src=", it does not match single-quoted values (src='image2.png') or whitespace around the equals sign (src = "image3.gif", src= "image4.webp"). Those tags are skipped.

  • Example: The provided example demonstrates the function's usage and output. It includes quoting and spacing variations, which this pattern does not match, and an img tag with a data-src attribute, whose value is captured because [^>]+ can consume the data- prefix that precedes src=".
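
The variations discussed above (single quotes, spaces around =, and data-src) can be covered by a slightly longer pattern. The following is a minimal sketch, not a definitive implementation: the names getImgSrcsRobust and sampleHtml are hypothetical, and it assumes src values are always quoted. Unquoted values such as src=a.jpg are still missed, and an attribute value that itself contains " src=" can still confuse it.

```javascript
// Hypothetical helper: a more tolerant variant of getImgSrcs.
function getImgSrcsRobust(htmlString) {
  // <img\b      : the tag name, followed by a word boundary
  // [^>]*?      : any attributes before src, without leaving the tag
  // \ssrc       : src preceded by whitespace, so data-src does not match
  // \s*=\s*     : optional whitespace around the equals sign
  // (["'])      : group 1, the opening quote (single or double)
  // (.*?)\1     : group 2, the value, up to the same quote character
  const regex = /<img\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/gi;
  // matchAll requires the g flag and yields one match object per img tag.
  return Array.from(htmlString.matchAll(regex), (m) => m[2]);
}

const sampleHtml = `
<div>
  <img src="image1.jpg" alt="Image 1">
  <img src='image2.png' alt="Image 2">
  <img src = "image3.gif" alt="Image 3">
  <img src= "image4.webp" alt="Image 4">
  <img data-src="data-image.svg">
  <p>Some text here</p>
  <img src="image5.jpg" alt="Image 5">
</div>`;

console.log(getImgSrcsRobust(sampleHtml));
// Output: ['image1.jpg', 'image2.png', 'image3.gif', 'image4.webp', 'image5.jpg']
```

The value is read from match index 2 here because index 1 now holds the quote character, which the backreference \1 uses to find the matching closing quote.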

The pattern in getImgSrcs is easy to understand and works when the markup is uniform: double-quoted src values with no whitespace around the equals sign and no other attributes ending in src. For arbitrary or complex HTML, a dedicated HTML parser is generally recommended. A regex solution is best kept for simple, predictable input, which is often all that front-end code needs.
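
For the parser-based route mentioned above, here is a minimal sketch that assumes a browser environment where DOMParser is available. The function names getImgSrcsFromDocument and getImgSrcsWithParser are hypothetical. The first function accepts any object that exposes querySelectorAll, so it can also be used with an already-loaded document.

```javascript
// Hypothetical helper: collect src values from a parsed document.
function getImgSrcsFromDocument(doc) {
  // The selector img[src] picks only <img> elements that have a src attribute,
  // so attributes such as data-src are not picked up.
  return Array.from(doc.querySelectorAll('img[src]'), (img) => img.getAttribute('src'));
}

// Browser-only wrapper: DOMParser is a browser API and is not defined in plain Node.js.
function getImgSrcsWithParser(htmlString) {
  const doc = new DOMParser().parseFromString(htmlString, 'text/html');
  return getImgSrcsFromDocument(doc);
}
```

getAttribute('src') returns the value exactly as written in the markup, whereas the img.src property returns the URL resolved against the document's base URL. Quoting style and spacing around = are handled by the parser, so no pattern tuning is needed.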

posted @ 2024-12-10 09:51  王铁柱6