Text Summarizer Project (Code) in Python

About the project: This is a standalone Python script that summarizes a block of text using a simple, frequency-based algorithm.

The program takes text input, counts how often each word occurs (ignoring common stop words), scores each sentence by the frequencies of the words it contains, and selects the highest-scoring sentences as the summary.
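
To make the frequency-based idea concrete, here is a minimal sketch of the counting and scoring steps on a tiny input. The sample text, the three-word stop list, and the `tokenize` helper are assumptions made for illustration only; the full script below uses a much longer stop-word list and a more careful sentence splitter.

```python
import re
from collections import Counter

# Hypothetical three-sentence input, chosen only for illustration.
text = "Cats sleep a lot. Cats chase mice. Dogs bark."

# A deliberately tiny stop-word set; the full script uses a longer list.
stop_words = {"a", "the", "is"}

def tokenize(s):
    """Lowercase the string and return its runs of letters."""
    return re.findall(r"[a-z]+", s.lower())

# Step 1: count how often each non-stop word occurs in the whole text.
word_freq = Counter(w for w in tokenize(text) if w not in stop_words)

# Step 2: split into sentences (naively, at whitespace after '.', '?' or '!').
sentences = re.split(r"(?<=[.?!])\s+", text)

# Step 3: a sentence's score is the sum of its words' frequencies.
# Stop words are absent from word_freq, so they contribute 0.
scores = {s: sum(word_freq[w] for w in tokenize(s)) for s in sentences}

for sentence, score in scores.items():
    print(score, sentence)
```

In this sketch "cats" occurs twice, so both sentences that mention it score higher than "Dogs bark." and would be preferred in a short summary.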


How to use this program:

To use this Text Summarizer, follow these steps:

Save the code: Save the code as a Python file (e.g., text_summarizer.py).

Run the script: Open your terminal, navigate to the directory where you saved the file, and run python text_summarizer.py.

The program will prompt you to enter your text. Once you are done, type q or quit on a new line and press Enter.

You will then be asked for the desired length of the summary in sentences.


  • The program will then display the summarized text.
  • This summarizer is a basic implementation and works best on well-structured text.
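
The multi-line input step described above (keep reading lines until q or quit is typed) can be sketched as a small function. The name `read_until_sentinel` and its parameters are hypothetical and do not appear in the script, which reads from `input()` directly. This version takes any iterable of lines so it can be tried without a terminal, and it also strips surrounding whitespace before comparing, which the script itself does not do.

```python
def read_until_sentinel(lines, sentinels=("q", "quit")):
    """Collect lines until one equals a sentinel, then join them with spaces.

    `lines` is any iterable of strings (hypothetical stand-in for repeated
    input() calls). The comparison ignores case and surrounding whitespace.
    """
    collected = []
    for line in lines:
        if line.strip().lower() in sentinels:
            break  # sentinel reached: stop reading, discard the sentinel
        collected.append(line)
    return " ".join(collected)


# Example: the line after "Q" is never read.
typed = ["First line of text.", "Second line.", "Q", "ignored"]
print(read_until_sentinel(typed))
```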

Project Level: Intermediate

You can copy the code below, paste it into any Python editor, and run it.

Follow these steps:

Step 1: Copy the code below.

Step 2: Paste the code into your chosen editor.

Step 3: Save the file with a name of your choice and the .py extension.

Step 4: Run the script (press F5 if using Python IDLE).




# text_summarizer.py

import re
from collections import defaultdict
from heapq import nlargest

def summarize_text(text, num_sentences=3):
    """
    Summarizes a block of text using a simple frequency-based method.
    
    Args:
        text (str): The full text to be summarized.
        num_sentences (int): The number of sentences to include in the summary.
                             Defaults to 3.

    Returns:
        str: The summarized text.
    """
    if not text:
        return "Text to summarize cannot be empty."

    # Pre-process the text: replace every non-letter character (digits and
    # punctuation included) with a space, then convert to lowercase
    formatted_text = re.sub(r'[^a-zA-Z]', ' ', text).lower()
    
    # Get a list of "stop words" (common words to ignore)
    stop_words = set([
        "i", "me", "my", "myself", "we", "our", "ours", "ourselves", "you", "your",
        "yours", "yourself", "yourselves", "he", "him", "his", "himself", "she",
        "her", "hers", "herself", "it", "its", "itself", "they", "them", "their",
        "theirs", "themselves", "what", "which", "who", "whom", "this", "that",
        "these", "those", "am", "is", "are", "was", "were", "be", "been", "being",
        "have", "has", "had", "having", "do", "does", "did", "doing", "a", "an",
        "the", "and", "but", "if", "or", "because", "as", "until", "while", "of",
        "at", "by", "for", "with", "about", "against", "between", "into", "through",
        "during", "before", "after", "above", "below", "to", "from", "up", "down",
        "in", "out", "on", "off", "over", "under", "again", "further", "then",
        "once", "here", "there", "when", "where", "why", "how", "all", "any",
        "both", "each", "few", "more", "most", "other", "some", "such", "no",
        "nor", "not", "only", "own", "same", "so", "than", "too", "very", "s",
        "t", "can", "will", "just", "don", "should", "now"
    ])
    
    # Tokenize the formatted text into words
    words = formatted_text.split()
    
    # Calculate word frequency, ignoring stop words
    word_freq = defaultdict(int)
    for word in words:
        if word not in stop_words:
            word_freq[word] += 1
            
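    # Split the original text into sentences at whitespace that follows
    # '.', '?' or '!', skipping abbreviation patterns such as "e.g." or "Dr."
    sentence_scores = defaultdict(int)
    sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=[.?!])\s', text)
    
    # Score each sentence by summing the frequencies of its words.
    # Words are extracted with the same letters-only rule used for word_freq,
    # so attached punctuation (e.g. "text.") does not prevent a match.
    for sentence in sentences:
        for word in re.findall(r'[a-zA-Z]+', sentence.lower()):
            if word in word_freq:
                sentence_scores[sentence] += word_freq[word]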

    # Get the top N sentences with the highest scores
    summary_sentences = nlargest(num_sentences, sentence_scores, key=sentence_scores.get)
    
    return " ".join(summary_sentences)

def main():
    """
    Main function to run the Text Summarizer app.
    """
    print("--- Python Text Summarizer ---")
    print("Enter a block of text to summarize.")
    print("Type 'q' or 'quit' on a new line to finish your input.")

    text_input_lines = []
    while True:
        line = input()
        if line.lower() in ['q', 'quit']:
            break
        text_input_lines.append(line)
    
    full_text = " ".join(text_input_lines)
    
    if not full_text.strip():
        print("No text was entered.")
        return
        
    try:
        num_sentences = int(input("How many sentences should the summary be? (e.g., 3): ").strip())
        if num_sentences <= 0:
            print("Number of sentences must be greater than zero. Using default of 3.")
            num_sentences = 3
    except ValueError:
        print("Invalid input. Using default of 3 sentences for the summary.")
        num_sentences = 3

    summary = summarize_text(full_text, num_sentences)
    
    print("\n--- Summary ---")
    print(summary)
    print("---------------")

# This ensures that main() is called only when the script is executed directly.
if __name__ == "__main__":
    main()
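

Note that the script joins the selected sentences in descending score order, so a summary may present sentences out of their original sequence. As one possible alternative (a sketch, not part of the script above), the top sentences could be chosen by score and then emitted in document order. The helper name `pick_top_in_order` and its parameters are hypothetical.

```python
from heapq import nlargest

def pick_top_in_order(sentences, scores, n):
    """Return the n highest-scoring sentences in their original order.

    `sentences` is the list of sentences in document order and `scores`
    maps each sentence to its score; both are hypothetical stand-ins for
    the script's `sentences` and `sentence_scores` variables.
    """
    # Rank sentence positions (not the strings) by score, so that the
    # original position of each selected sentence is preserved.
    top_positions = nlargest(
        n,
        range(len(sentences)),
        key=lambda i: scores.get(sentences[i], 0),
    )
    # Sorting the positions restores document order.
    return [sentences[i] for i in sorted(top_positions)]


# Example with made-up scores.
demo_sentences = ["A.", "B.", "C.", "D."]
demo_scores = {"A.": 1, "B.": 5, "C.": 2, "D.": 4}
print(" ".join(pick_top_in_order(demo_sentences, demo_scores, 2)))
```

Working with positions rather than sentence strings also keeps repeated sentences distinct during selection, although their shared score still comes from a single dictionary entry.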





For more complex summarization tasks, you would typically use machine learning models and natural language processing (NLP) libraries.