Text Summarizer Project (Code) in Python


Visit pythonforbiginners.com to discover Python tutorials.

About the project: This is a Text Summarizer project in Python.

It is a standalone script that summarizes a block of text using a simple, frequency-based algorithm.

The program takes a text input, counts how often each word appears (ignoring common stop words), and then selects the highest-scoring sentences to create a summary.
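To make the idea concrete, here is a minimal sketch of frequency-based scoring on a toy input. The sample text and the one-word stop list are illustrative assumptions, not part of the project code, and the sketch is simplified compared with the full script.

```python
import re
from collections import Counter

# Toy input and a deliberately tiny stop-word list (both are illustrative assumptions).
text = "Python is popular. Python code is readable. Cats sleep."
stop_words = {"is"}

# Step 1: count every non-stop word across the whole text.
words = re.findall(r"[a-z]+", text.lower())
word_freq = Counter(w for w in words if w not in stop_words)

# Step 2: score each sentence as the sum of the frequencies of its words.
# Stop words are absent from the Counter, so they contribute 0.
sentences = re.split(r"(?<=[.?!])\s+", text)
scores = {
    s: sum(word_freq[w] for w in re.findall(r"[a-z]+", s.lower()))
    for s in sentences
}

# Step 3: keep the highest-scoring sentence as a one-sentence summary.
best = max(scores, key=scores.get)
print(word_freq["python"])  # 2
print(best)                 # Python code is readable.
```

"Python" appears twice, so both sentences that mention it outrank "Cats sleep.", and the longer of the two wins because it contains more counted words.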

How to use this program:

To use this Text Summarizer, simply save the code as a Python file and run it from your terminal:

Save the code: Save the code as a Python file (e.g., text_summarizer.py).

Run the script: Open your terminal, navigate to the directory where you saved the file, and run python text_summarizer.py.

The program will prompt you to enter your text. Once you are done, type q or quit on a new line and press Enter.

You will then be asked for the desired length of the summary in sentences.

The program will then display the summarized text.
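The "type q or quit to finish" input style described above can be sketched as a small helper. The function name `read_until_quit` is hypothetical and does not appear in the script; it takes any iterable of lines instead of calling `input()`, so it can be tried without typing anything.

```python
def read_until_quit(lines):
    """Collect lines until a 'q' or 'quit' line is seen, then join them with spaces.

    `lines` is any iterable of strings. This is a hypothetical helper for
    illustration, not a function from the script.
    """
    collected = []
    for line in lines:
        # The sentinel check is case-insensitive, so 'Q' and 'QUIT' also stop input.
        if line.lower() in ("q", "quit"):
            break
        collected.append(line)
    return " ".join(collected)


typed = ["First line of text.", "Second line.", "quit", "ignored"]
result = read_until_quit(typed)
print(result)  # First line of text. Second line.
```

Everything after the sentinel line is discarded, and the collected lines are joined with single spaces so the text can be treated as one block.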

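This summarizer is a basic implementation: it relies on sentence-ending punctuation to find sentence boundaries and on raw word counts to rank sentences, so it works best on well-structured English text.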

Project Level: Intermediate

You can copy the code below, paste it into any Python editor, and run it.

Steps: Follow these steps

Step 1: Copy the code below.

Step 2: Paste the code into your chosen editor.

Step 3: Save the code with a filename and the .py extension.

Step 4: Run it (press F5 if using Python IDLE).



# text_summarizer.py

import re
from collections import defaultdict
from heapq import nlargest

def summarize_text(text, num_sentences=3):
    """
    Summarizes a block of text using a simple frequency-based method.
    
    Args:
        text (str): The full text to be summarized.
        num_sentences (int): The number of sentences to include in the summary.
                             Defaults to 3.

    Returns:
        str: The summarized text.
    """
    if not text:
        return "Text to summarize cannot be empty."

    # Pre-process the text: remove special characters and convert to lowercase
    formatted_text = re.sub('[^a-zA-Z]', ' ', text).lower()
    
    # Get a list of "stop words" (common words to ignore)
    stop_words = set([
        "i", "me", "my", "myself", "we", "our", "ours", "ourselves", "you", "your",
        "yours", "yourself", "yourselves", "he", "him", "his", "himself", "she",
        "her", "hers", "herself", "it", "its", "itself", "they", "them", "their",
        "theirs", "themselves", "what", "which", "who", "whom", "this", "that",
        "these", "those", "am", "is", "are", "was", "were", "be", "been", "being",
        "have", "has", "had", "having", "do", "does", "did", "doing", "a", "an",
        "the", "and", "but", "if", "or", "because", "as", "until", "while", "of",
        "at", "by", "for", "with", "about", "against", "between", "into", "through",
        "during", "before", "after", "above", "below", "to", "from", "up", "down",
        "in", "out", "on", "off", "over", "under", "again", "further", "then",
        "once", "here", "there", "when", "where", "why", "how", "all", "any",
        "both", "each", "few", "more", "most", "other", "some", "such", "no",
        "nor", "not", "only", "own", "same", "so", "than", "too", "very", "s",
        "t", "can", "will", "just", "don", "should", "now"
    ])
    
    # Tokenize the formatted text into words
    words = formatted_text.split()
    
    # Calculate word frequency, ignoring stop words
    word_freq = defaultdict(int)
    for word in words:
        if word not in stop_words:
            word_freq[word] += 1
            
    # Calculate sentence scores based on word frequency
    sentence_scores = defaultdict(int)
    # Split after '.', '?' or '!', but not after abbreviations such as "e.g." or "Dr."
    sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=[.?!])\s+', text)
    
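    for sentence in sentences:
        # Tokenize the same way as the frequency pass, so that trailing
        # punctuation (e.g. "text.") does not prevent a match
        for word in re.sub('[^a-zA-Z]', ' ', sentence).lower().split():
            if word in word_freq:
                sentence_scores[sentence] += word_freq[word]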

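    # Get the top N sentences with the highest scores
    top_sentences = set(nlargest(num_sentences, sentence_scores, key=sentence_scores.get))

    # Join the selected sentences in their original order so the summary reads naturally
    return " ".join(s for s in dict.fromkeys(sentences) if s in top_sentences)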

def main():
    """
    Main function to run the Text Summarizer app.
    """
    print("--- Python Text Summarizer ---")
    print("Enter a block of text to summarize.")
    print("Type 'q' or 'quit' on a new line to finish your input.")

    text_input_lines = []
    while True:
        line = input()
        if line.lower() in ['q', 'quit']:
            break
        text_input_lines.append(line)
    
    full_text = " ".join(text_input_lines)
    
    if not full_text:
        print("No text was entered.")
        return
        
    try:
        num_sentences = int(input("How many sentences should the summary be? (e.g., 3): ").strip())
        if num_sentences <= 0:
            print("Number of sentences must be greater than zero. Using default of 3.")
            num_sentences = 3
    except ValueError:
        print("Invalid input. Using default of 3 sentences for the summary.")
        num_sentences = 3

    summary = summarize_text(full_text, num_sentences)
    
    print("\n--- Summary ---")
    print(summary)
    print("---------------")

# This ensures that main() is called only when the script is executed directly.
if __name__ == "__main__":
    main()
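The sentence-splitting step is the least obvious part of the script, so the sketch below isolates it. It uses the same negative lookbehinds as the split in summarize_text; the names `SENTENCE_BOUNDARY` and `sample` are illustrative assumptions.

```python
import re

# Same lookbehinds as the split in summarize_text:
#   (?<!\w\.\w.)        do not split after short dotted abbreviations such as "e.g."
#   (?<![A-Z][a-z]\.)   do not split after titles such as "Dr." or "Mr."
#   (?<=\.|\?)          only split right after a period or question mark
SENTENCE_BOUNDARY = re.compile(r"(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s")

sample = "Dr. Smith uses Python, e.g. for scripts. It works well. Does it scale?"
sentences = SENTENCE_BOUNDARY.split(sample)
for sentence in sentences:
    print(sentence)
# Dr. Smith uses Python, e.g. for scripts.
# It works well.
# Does it scale?
```

One side effect of the second lookbehind is that a sentence ending in a capitalized two-letter word (for example "I use Go.") is not split from the sentence that follows it.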

For more complex summarization tasks, you would typically use machine learning models and natural language processing (NLP) libraries.
