Crafting Excerpts in PHP or WordPress: Word, Sentence, and Paragraph Count Techniques

Creating excerpts in PHP is a common task in content management and website development. An excerpt is a shortened version of a longer piece of content, often used to provide a preview or summary. PHP developers might need to create excerpts based on word, sentence, or paragraph counts. This article explores methods to achieve this, along with best practices and how to handle cases where the requested count exceeds the length of the content.

Excerpt by Word Count

Creating an excerpt by word count involves truncating the content after a certain number of words.

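function excerptByWordCount($content, $wordCount) {
    // Split on any run of whitespace so double spaces and line breaks do not create empty "words"
    $words = preg_split('/\s+/', trim($content), -1, PREG_SPLIT_NO_EMPTY);
    // Only truncate when the content is longer than the requested excerpt
    if (count($words) > $wordCount) {
        $words = array_slice($words, 0, $wordCount);
        $content = implode(' ', $words);
    }
    return $content;
}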

Usage:

// Excerpt of first 50 words
$wordCountExcerpt = excerptByWordCount($originalContent, 50); 

Best Practices and Handling Overcounts:

  • Check Word Count: Before truncating, check if the word count of the original content exceeds the desired excerpt length. If not, return the original content.
  • Avoid Breaking Words: Ensure the last word in the excerpt is complete to maintain readability.
  • Add an Ellipsis: Optionally, add an ellipsis (...) at the end if the content is truncated.
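
The practices above can be combined in one function. The following is a minimal sketch, not a definitive implementation: the function name excerptByWordCountWithEllipsis and the $suffix parameter are hypothetical, and it assumes the content is plain text encoded as UTF-8.

```php
<?php
// Hypothetical helper: keeps $wordCount whole words and appends $suffix
// only when the text was actually shortened.
function excerptByWordCountWithEllipsis(string $content, int $wordCount, string $suffix = '...'): string
{
    if ($wordCount < 1) {
        return '';
    }

    // Split on any run of whitespace; the "u" modifier treats the input as UTF-8.
    $words = preg_split('/\s+/u', trim($content), -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on invalid UTF-8; in that case, or when the
    // content is already short enough, return it unchanged.
    if ($words === false || count($words) <= $wordCount) {
        return $content;
    }

    $excerpt = implode(' ', array_slice($words, 0, $wordCount));

    // Drop trailing punctuation so the suffix does not follow a comma or period.
    return rtrim($excerpt, '.,;:') . $suffix;
}

echo excerptByWordCountWithEllipsis("The quick  brown\nfox jumps over", 3);
```

Because the function splits on whole words, the last word is never cut in half, and content shorter than the requested count is returned without an ellipsis.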

Excerpt by Sentence Count

Creating excerpts by sentence count involves keeping a certain number of sentences from the content.

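function excerptBySentenceCount($content, $sentenceCount) {
    // Split on periods, trim each piece, and drop empty pieces (such as the one after the final period)
    $sentences = array_values(array_filter(array_map('trim', explode('.', $content)), 'strlen'));
    if (count($sentences) > $sentenceCount) {
        $sentences = array_slice($sentences, 0, $sentenceCount);
        $content = implode('. ', $sentences) . '.';
    }
    return $content;
}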

Usage:

// Excerpt of first 3 sentences
$sentenceCountExcerpt = excerptBySentenceCount($originalContent, 3); 

The version above splits only on periods, so it miscounts sentences that end with an exclamation mark or a question mark. To handle any typical sentence-ending punctuation, split the content with a regular expression instead:

function excerptBySentenceCount($content, $sentenceCount) {
    // Use a regular expression to split the content by sentence-ending punctuation
    $sentences = preg_split('/(?<=[.!?])\s+/', $content, -1, PREG_SPLIT_NO_EMPTY);

    if (count($sentences) > $sentenceCount) {
        $sentences = array_slice($sentences, 0, $sentenceCount);
        $content = implode(' ', $sentences);
        // Check the last character to ensure it ends with punctuation
        if (!preg_match('/[.!?]$/', $content)) {
            $content .= '.';
        }
    }
    return $content;
}

This function uses preg_split with the regular expression (regex) /(?<=[.!?])\s+/, which splits the text at whitespace (\s+) that follows a period, exclamation mark, or question mark ([.!?]). The (?<=...) is a positive lookbehind assertion: it requires the punctuation to be present but does not consume it, so each sentence keeps its ending punctuation. The PREG_SPLIT_NO_EMPTY flag ensures that only non-empty pieces are returned.

Finally, the function checks whether the last character of the resulting content is a sentence-ending punctuation mark. If not, it appends a period to maintain proper punctuation at the end of the excerpt.

Best Practices and Handling Overcounts:

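  • Proper Sentence Detection: Split only where sentence-ending punctuation is followed by whitespace. This avoids splitting on periods inside decimals (3.14) or domain names, although abbreviations followed by a space (such as “Dr. Smith”) are still treated as sentence ends.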
  • Check Sentence Count: Similar to word count, verify if the sentence count of the original content is sufficient.
  • Maintain Punctuation: Ensure the excerpt ends with proper punctuation, typically a period.
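
Abbreviations are the hardest part of sentence detection. One possible approach, sketched below under stated assumptions, is to split with the same regular expression and then re-join any piece that ends in a known abbreviation. The function name and the abbreviation list are hypothetical and purely illustrative; real content will need a longer list or a dedicated library.

```php
<?php
// Hypothetical helper: splits at sentence-ending punctuation, then merges
// pieces that end in an abbreviation from an assumed, illustrative list.
function excerptBySentenceCountWithAbbreviations(string $content, int $sentenceCount): string
{
    $abbreviations = ['Dr.', 'Mr.', 'Mrs.', 'Ms.', 'e.g.', 'i.e.', 'vs.'];

    $pieces = preg_split('/(?<=[.!?])\s+/', trim($content), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $sentences = [];
    $buffer = '';
    foreach ($pieces as $piece) {
        $buffer = ($buffer === '') ? $piece : $buffer . ' ' . $piece;

        // Inspect the last whitespace-separated token collected so far.
        preg_match('/\S+$/', $buffer, $match);
        $lastToken = $match[0] ?? '';

        if (in_array($lastToken, $abbreviations, true)) {
            continue; // Not a real sentence end, so keep accumulating.
        }

        $sentences[] = $buffer;
        $buffer = '';
    }

    // Keep any trailing text that ended on an abbreviation.
    if ($buffer !== '') {
        $sentences[] = $buffer;
    }

    return implode(' ', array_slice($sentences, 0, max(0, $sentenceCount)));
}

echo excerptBySentenceCountWithAbbreviations('Dr. Smith arrived. He sat down! Was it late? Yes.', 2);
```

An abbreviation that genuinely ends a sentence will be merged with the sentence that follows it, so this trade-off suits content where titles such as “Dr.” are common.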

Excerpt by Paragraph Count

Creating excerpts by paragraph count involves truncating the content after a certain number of paragraphs.

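function excerptByParagraphCount($content, $paragraphCount) {
    // Split on one or more line breaks (\R matches \n, \r\n, and \r) so blank lines are not counted as paragraphs
    $paragraphs = preg_split('/\R+/', trim($content), -1, PREG_SPLIT_NO_EMPTY);
    if (count($paragraphs) > $paragraphCount) {
        $paragraphs = array_slice($paragraphs, 0, $paragraphCount);
        $content = implode("\n", $paragraphs);
    }
    return $content;
}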

Usage:

// Excerpt of first 2 paragraphs
$paragraphCountExcerpt = excerptByParagraphCount($originalContent, 2); 

Best Practices and Handling Overcounts:

  • Use New Lines for Paragraphs: Paragraphs are typically separated by one or more line breaks (\n). Ensure your content follows this format.
  • Check Paragraph Count: Validate if the paragraph count of the content is adequate for the excerpt.
  • Respect Content Structure: Maintain the structure of the paragraphs in the excerpt to preserve the content’s integrity.

Excerpt by HTML Paragraph Count

When dealing with HTML content, you’ll want to extract excerpts based on the <p> tags to maintain the structure and formatting of the original content.

function excerptByHtmlParagraphCount($content, $paragraphCount) {
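    // \b keeps <pre> and <param> from matching; "s" lets . span line breaks, and "i" also matches <P>
    preg_match_all('/<p\b[^>]*>.*?<\/p>/is', $content, $paragraphs);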
    $paragraphs = $paragraphs[0];

    if (count($paragraphs) > $paragraphCount) {
        $paragraphs = array_slice($paragraphs, 0, $paragraphCount);
        $content = implode(' ', $paragraphs);
    }
    return $content;
}

Usage:

// Excerpt of first 2 paragraphs
$paragraphCountExcerpt = excerptByHtmlParagraphCount($htmlContent, 2); 

Best Practices and Handling Overcounts:

  • Regular Expressions for Tag Matching: Use preg_match_all with a regular expression to match <p> tags. This approach ensures that the structure and attributes of the paragraph tags are preserved.
  • Respect HTML Structure: Ensure that the excerpt maintains the HTML structure. Avoid breaking tags, which can lead to rendering issues.
  • Check Paragraph Count: As with plain text, verify if the paragraph count of the original content is sufficient for the excerpt.
  • Handle Nested Tags: Remember that paragraphs can contain other HTML elements like links or spans. Ensure your regex accounts for nested tags within paragraphs.

Creating excerpts based on HTML paragraph count in PHP is a more advanced task compared to handling plain text. It’s essential to use regular expressions carefully to maintain the integrity of the HTML structure. This method is especially relevant for web applications where the content needs to be displayed with its original formatting. As always, validate the length of the original content and consider user experience when presenting excerpts.
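
When the markup is too irregular for a regular expression, an HTML parser is a possible alternative. The sketch below swaps in PHP's DOM extension (DOMDocument) for the regex; it is not the method used above. The function name is hypothetical, and the example assumes the dom extension is enabled and the input is UTF-8.

```php
<?php
// Hypothetical helper: collects the first $paragraphCount <p> elements
// with PHP's DOM extension instead of a regular expression.
function excerptByDomParagraphCount(string $html, int $paragraphCount): string
{
    $document = new DOMDocument();

    // Collect parser warnings internally instead of printing them for imperfect markup.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix is a common workaround to hint UTF-8 to the parser.
    $document->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $paragraphs = $document->getElementsByTagName('p');

    // Return the original markup when there is nothing to cut.
    if ($paragraphs->length <= $paragraphCount) {
        return $html;
    }

    $excerpt = [];
    foreach ($paragraphs as $paragraph) {
        if (count($excerpt) >= $paragraphCount) {
            break;
        }
        // saveHTML() with a node argument serializes only that element.
        $excerpt[] = trim($document->saveHTML($paragraph));
    }

    return implode("\n", $excerpt);
}

$html = "<p class=\"intro\">First\nparagraph</p><pre>code</pre><p>Second</p><p>Third</p>";
echo excerptByDomParagraphCount($html, 2);
```

The parser handles attributes, nested inline elements, and paragraphs that span several lines, at the cost of re-serializing the markup, so the output may differ slightly in formatting from the input.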

WordPress has its own set of functions and features for creating excerpts, which can greatly simplify the process compared to handling excerpts manually in PHP. Here’s an overview of the key WordPress functions related to excerpts:

The Excerpt Function in WordPress

The WordPress API offers a robust system for handling excerpts, so hand-written PHP functions are unnecessary for most typical use cases. WordPress provides a user-friendly way to manage post summaries, whether it’s customizing the length, changing the read more text, or using template tags to display excerpts.

the_excerpt()

This WordPress template tag automatically prints an excerpt for a post. It’s commonly used in themes to display a post summary on archive pages.

  • Usage: Place the_excerpt() within The Loop in your theme files where you want the excerpt to appear.
  • Behavior: By default, it shows the first 55 words of the post. If there’s a manually set excerpt in the post editor, it will display that instead.

get_the_excerpt()

This function retrieves the excerpt without displaying it, giving you more control over how and where to use it.

  • Usage: get_the_excerpt($post) can be used to fetch the excerpt of a specific post.
  • Customization: You can manipulate the returned string as needed before displaying it.
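
As an illustration of manipulating the returned string, the sketch below shortens an excerpt to its first sentence before printing it. The helper firstSentenceOfExcerpt is hypothetical; get_the_excerpt() and esc_html() are WordPress functions, so the final call only runs inside WordPress (typically within The Loop).

```php
<?php
// Hypothetical helper: returns only the first sentence of an excerpt string.
function firstSentenceOfExcerpt(string $excerpt): string
{
    // A limit of 2 stops splitting after the first sentence boundary.
    $sentences = preg_split('/(?<=[.!?])\s+/', trim($excerpt), 2, PREG_SPLIT_NO_EMPTY);

    return $sentences[0] ?? '';
}

// Guarded so the file also loads outside WordPress, where these functions do not exist.
if (function_exists('get_the_excerpt') && function_exists('esc_html')) {
    echo esc_html(firstSentenceOfExcerpt(get_the_excerpt()));
}
```

Escaping the result with esc_html() before output keeps any stray markup in the excerpt from being rendered.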

Customizing Excerpt Length

WordPress allows you to change the default excerpt length via the excerpt_length filter.

function custom_excerpt_length($length) {
    return 20; // Return 20 words as the new excerpt length
}
add_filter('excerpt_length', 'custom_excerpt_length');

Managing More Tag and Excerpt More Text

the_content('Read more')

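On archive pages, this function displays the content up to the “more” tag (<!--more-->), followed by a link with the given text. Because authors insert the tag themselves in the content editor, it is useful for setting a custom cut-off point for each post.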

Customizing Excerpt More Text

You can customize the text that appears at the end of an excerpt (like […]) by using the excerpt_more filter.

function custom_excerpt_more($more) {
    return '...'; // Replace the default [...] with ...
}
add_filter('excerpt_more', 'custom_excerpt_more');

Handling HTML in Excerpts

Automatically generated WordPress excerpts are plain text by default because HTML tags are stripped. For advanced requirements, such as preserving HTML tags or building excerpts from specific elements like sentences or paragraphs, you will need custom code or a plugin designed for this purpose.

Douglas Karr

Douglas Karr is CMO of OpenINSIGHTS and the founder of the Martech Zone. Douglas has helped dozens of successful MarTech startups, has assisted in the due diligence of over $5 billion in Martech acquisitions and investments, and continues to assist companies in implementing and automating their sales and marketing strategies. Douglas is an internationally recognized digital transformation and MarTech expert and speaker. Douglas is also a published author of a Dummies guide and a business leadership book.
