| 1 | +# DupliPy 0.1.1 |
| 2 | + |
| 3 | +An open source Python library for text formatting, augmentation, and similarity calculation tasks in NLP. |
| 4 | + |
| 5 | +## Installation |
| 6 | + |
| 7 | +You can install DupliPy using pip: |
| 8 | + |
| 9 | +```bash |
| 10 | +pip install duplipy |
| 11 | +``` |
| 12 | + |
| 13 | +## Supported Python Versions |
| 14 | + |
| 15 | +DupliPy supports the following Python versions: |
| 16 | + |
| 17 | +- Python 3.6 |
| 18 | +- Python 3.7 |
| 19 | +- Python 3.8 |
| 20 | +- Python 3.9 |
| 21 | +- Python 3.10 |
| 22 | +- Python 3.11 |
| 23 | + |
| 24 | +Please ensure that you have one of these Python versions installed before using DupliPy. DupliPy may not work as expected on Python versions older than those listed. |
| 25 | + |
| 26 | +## Features |
| 27 | + |
| 28 | +- Text Formatting: Remove special characters, standardize text formatting. |
| 29 | +- Text Replication: Generate replicated instances of text for data augmentation. |
| 30 | +- Sentiment Analysis: Determine the sentiment expressed in a sentence. |
| 31 | +- Similarity Calculation: Calculate text similarity using various metrics. |
| 32 | +- BLEU Score Calculation: Score generated text against a reference text to evaluate text-based NLP models. |
| 33 | + |
| 34 | +## Usage |
| 35 | + |
| 36 | +### Text Formatting |
| 37 | + |
| 38 | +```python |
| 39 | +from duplipy.formatting import remove_special_characters, standardize_text |
| 40 | + |
| 41 | +text = "Hello! This is some text." |
| 42 | + |
| 43 | +# Remove special characters |
| 44 | +formatted_text = remove_special_characters(text) |
| 45 | +print(formatted_text) # Output: Hello This is some text |
| 46 | + |
| 47 | +# Standardize text formatting |
| 48 | +standardized_text = standardize_text(text) |
| 49 | +print(standardized_text) # Output: hello! this is some text |
| 50 | +``` |
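To make the first step above concrete, here is a minimal sketch of one way to remove special characters with the standard library. The helper name `strip_special_characters` is hypothetical and this is not DupliPy's implementation; it assumes "special" means anything other than ASCII letters, digits, and whitespace.

```python
import re


def strip_special_characters(text: str) -> str:
    """Drop every character that is not an ASCII letter, digit, or whitespace.

    Hypothetical helper for illustration only; DupliPy's own
    remove_special_characters may use different rules.
    """
    return re.sub(r"[^A-Za-z0-9\s]", "", text)


print(strip_special_characters("Hello! This is some text."))
# Hello This is some text
```

Whitespace is left untouched here, so removing a symbol that sits between two spaces would leave a double space behind.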
| 51 | + |
| 52 | +### Text Replication |
| 53 | + |
| 54 | +```python |
| 55 | +from duplipy.replication import replace_word_with_synonym, augment_text_with_synonyms, augment_file_with_synonyms |
| 56 | + |
| 57 | +text = "Hello! This is some text." |
| 58 | + |
| 59 | +# Replace words with synonyms |
| 60 | +augmented_text = augment_text_with_synonyms(text, augmentation_factor=3, probability=0.5) |
| 61 | +print(augmented_text) |
| 62 | + |
| 63 | +# Output: |
| 64 | +# ['Hello! This is some text.', 'Hi! This is some text.', 'Hello! This is certain text.'] |
| 65 | + |
| 66 | +# Load text from a file and augment it |
| 67 | +file_path = "path/to/file.txt" |
| 68 | +augmented_file_text = augment_file_with_synonyms(file_path, augmentation_factor=3, probability=0.5) |
| 69 | +print(augmented_file_text) |
| 70 | +``` |
| 71 | + |
| 72 | +### Sentiment Analysis |
| 73 | + |
| 74 | +```python |
| 75 | +from duplipy.sentiment import analyze_sentiment |
| 76 | + |
| 77 | +text = "I love this product! It's amazing!" |
| 78 | + |
| 79 | +# Analyze sentiment |
| 80 | +sentiment = analyze_sentiment(text) |
| 81 | +print(sentiment) # Output: Positive |
| 82 | +``` |
| 83 | + |
| 84 | +### Similarity Calculation |
| 85 | + |
| 86 | +```python |
| 87 | +from duplipy.similarity import edit_distance_score |
| 88 | + |
| 89 | +text1 = "Hello! How are you?" |
| 90 | +text2 = "Hi! How are you doing?" |
| 91 | + |
| 92 | +# Calculate edit distance |
| 93 | +edit_distance = edit_distance_score(text1, text2) |
| 94 | +print(edit_distance) # Output: 4 |
| 95 | +``` |
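For readers unfamiliar with edit distance, the sketch below shows a standard character-level Levenshtein computation. It is an illustration under stated assumptions, not DupliPy's implementation: the function name `levenshtein` is hypothetical, and `edit_distance_score` may tokenize or normalize its input differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance counting insertions, deletions, substitutions.

    Hypothetical helper for illustration; not part of DupliPy.
    """
    # previous[j] is the distance between the prefix of `a` processed so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping only two rows of the table holds memory proportional to the length of the second string rather than the product of both lengths.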
| 96 | + |
| 97 | +### BLEU Score Calculation |
| 98 | + |
| 99 | +```python |
| 100 | +from duplipy.similarity import bleu_score |
| 101 | + |
| 102 | +text1 = "Hello! How are you?" |
| 103 | +text2 = "Hi! How are you doing?" |
| 104 | + |
| 105 | +# Calculate BLEU score |
| 106 | +bleu_value = bleu_score(text1, text2) |
| 107 | +print(bleu_value) # Output: 0.434 |
| 108 | +``` |
| 109 | + |
| 110 | +## Contributing |
| 111 | + |
| 112 | +Contributions are welcome! If you encounter any issues, have suggestions, or want to contribute to DupliPy, please open an issue or submit a pull request on [GitHub](https://github.com/infinitode/duplipy). |
| 113 | + |
| 114 | +## License |
| 115 | + |
| 116 | +DupliPy is released under the terms of the **MIT License (Modified)**. Please see the [LICENSE](https://github.com/infinitode/duplipy/blob/main/LICENSE) file for the full text. |
| 117 | + |
| 118 | +**Modified License Clause** |
| 119 | + |
| 120 | + |
| 121 | + |
| 122 | +The modified license clause grants users the permission to make derivative works based on the DupliPy software. However, it requires any substantial changes to the software to be clearly distinguished from the original work and distributed under a different name. |
| 123 | + |
| 124 | +By enforcing this distinction, the clause aims to prevent republishing the source code unchanged, while still allowing users to create derivative works that incorporate the code but differ from the original. |
| 125 | + |
| 126 | +Please read the full license terms in the [LICENSE](https://github.com/infinitode/duplipy/blob/main/LICENSE) file for complete details. |