Commit 8c81888

Added DupliPy testing files
1 parent 52d3319 commit 8c81888

4 files changed

Lines changed: 188 additions & 0 deletions

LICENSE

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
MIT License (Modified)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to make derivative works based on the Software, provided that any substantial changes to the Software are clearly distinguished from the original work and are distributed under a different name.

The original copyright notice and disclaimer must be retained in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS," WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

readme.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
# DupliPy 0.1.1

An open source Python library for text formatting, augmentation, and similarity calculation tasks in NLP.

## Installation

You can install DupliPy using pip:

```bash
pip install duplipy
```

## Supported Python Versions

DupliPy supports the following Python versions:

- Python 3.6
- Python 3.7
- Python 3.8
- Python 3.9
- Python 3.10
- Python 3.11

Please ensure that you have one of these Python versions installed before using DupliPy. DupliPy may not work as expected on Python versions older than those listed.

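
A script that depends on DupliPy could check the interpreter version up front. The sketch below is only an illustration: `MIN_SUPPORTED` and `is_supported` are hypothetical names, not part of DupliPy, and the 3.6 floor is taken from the list above.

```python
import sys

# Oldest interpreter version listed as supported (assumption: taken from the README list).
MIN_SUPPORTED = (3, 6)


def is_supported(version=None):
    """Return True if `version` (default: the running interpreter) is at least MIN_SUPPORTED."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); micro and release level are ignored.
    return tuple(version[:2]) >= MIN_SUPPORTED


if __name__ == "__main__":
    if is_supported():
        print("Python version is within the supported range.")
    else:
        print("DupliPy may not work as expected on this Python version.")
```
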
## Features

- Text Formatting: Remove special characters and standardize text formatting.
- Text Replication: Generate augmented copies of text for data augmentation.
- Sentiment Analysis: Classify the sentiment of a piece of text.
- Similarity Calculation: Measure text similarity with metrics such as edit distance.
- BLEU Score Calculation: Score generated text against a reference with BLEU, a common way to evaluate text-generation models.

## Usage

### Text Formatting

```python
from duplipy.formatting import remove_special_characters, standardize_text

text = "Hello! This is some text."

# Remove special characters
formatted_text = remove_special_characters(text)
print(formatted_text) # Output: Hello This is some text

# Standardize text formatting
standardized_text = standardize_text(text)
print(standardized_text) # Output: hello! this is some text
```

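
To make the idea of these formatting steps concrete, here is a minimal sketch using only the standard `re` module. It is not DupliPy's implementation: `strip_special_characters` and `lowercase_and_collapse_spaces` are hypothetical stand-ins, and the exact rules DupliPy applies (which characters count as special, how punctuation is treated) may differ.

```python
import re


def strip_special_characters(text):
    """Keep ASCII letters, digits, and whitespace; drop every other character."""
    return re.sub(r"[^A-Za-z0-9\s]", "", text)


def lowercase_and_collapse_spaces(text):
    """Collapse runs of whitespace to single spaces, trim the ends, and lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


print(strip_special_characters("Hello! This is some text."))  # Hello This is some text
print(lowercase_and_collapse_spaces("  Hello!   This is  some text. "))  # hello! this is some text.
```
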
### Text Replication

```python
from duplipy.replication import augment_text_with_synonyms, augment_file_with_synonyms

text = "Hello! This is some text."

# Replace words with synonyms
augmented_text = augment_text_with_synonyms(text, augmentation_factor=3, probability=0.5)
print(augmented_text)

# Example output (may vary between runs):
# ['Hello! This is some text.', 'Hi! This is some text.', 'Hello! This is certain text.']

# Load text from a file and augment it
file_path = "path/to/file.txt"
augmented_file_text = augment_file_with_synonyms(file_path, augmentation_factor=3, probability=0.5)
print(augmented_file_text)
```

### Sentiment Analysis

```python
from duplipy.sentiment import analyze_sentiment

text = "I love this product! It's amazing!"

# Analyze sentiment
sentiment = analyze_sentiment(text)
print(sentiment) # Output: Positive
```

### Similarity Calculation

```python
from duplipy.similarity import edit_distance_score

text1 = "Hello! How are you?"
text2 = "Hi! How are you doing?"

# Calculate edit distance
edit_distance = edit_distance_score(text1, text2)
print(edit_distance) # Output: 4
```

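
For readers unfamiliar with the metric, the sketch below shows a character-level Levenshtein edit distance in plain Python. It illustrates the general technique only; `levenshtein` is a hypothetical helper, and `edit_distance_score` may tokenize or weight edits differently.

```python
def levenshtein(a, b):
    """Character-level edit distance between strings a and b (insert, delete, substitute)."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("hi", "hello"))        # 4
```

Keeping only two rows of the table bounds memory by the length of the second string rather than by the product of both lengths.
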
### BLEU Score Calculation

```python
from duplipy.similarity import bleu_score

text1 = "Hello! How are you?"
text2 = "Hi! How are you doing?"

# Calculate BLEU score
bleu_value = bleu_score(text1, text2)
print(bleu_value) # Output: 0.434
```

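
As background on what a BLEU score measures, here is a simplified single-reference sketch in plain Python: clipped n-gram precision combined with a brevity penalty. It is an assumption-laden illustration, not DupliPy's `bleu_score`: `simple_bleu` and `ngrams` are hypothetical names, tokenization is a plain whitespace split, only n-gram orders 1 and 2 are used by default, and no smoothing is applied, so its values will not match the output shown above.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) found in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(reference, candidate, max_n=2):
    """Simplified BLEU for one reference/candidate pair, using n-gram orders 1..max_n."""
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    if not cand_tokens:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand_tokens, n)
        ref_counts = ngrams(ref_tokens, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))

    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(cand_tokens) > len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(cand_tokens))

    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


print(simple_bleu("the cat sat", "the cat sat"))  # 1.0
```
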
## Contributing

Contributions are welcome! If you encounter any issues, have suggestions, or want to contribute to DupliPy, please open an issue or submit a pull request on [GitHub](https://github.com/infinitode/duplipy).

## License

DupliPy is released under the terms of the **MIT License (Modified)**. Please see the [LICENSE](https://github.com/infinitode/duplipy/blob/main/LICENSE) file for the full text.

**Modified License Clause**

The modified license clause grants users permission to make derivative works based on the DupliPy software. However, it requires any substantial changes to the software to be clearly distinguished from the original work and distributed under a different name.

This distinction is meant to prevent republishing the source code unchanged, while still allowing derivative works that build on the code.

Please read the full license terms in the [LICENSE](https://github.com/infinitode/duplipy/blob/main/LICENSE) file for complete details.

setup.py

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
from setuptools import setup, find_packages

setup(
    name='duplipy',
    version='0.1.6',
    author='Infinitode Pty Ltd',
    author_email='infinitode.ltd@gmail.com',
    description='A package for formatting and text replication.',
    long_description='DupliPy is a quick and easy-to-use package that can handle text formatting and data augmentation tasks for NLP in Python.',
    long_description_content_type='text/markdown',
    url='https://github.com/infinitode/duplipy',
    packages=find_packages(),
    install_requires=[
        'nltk',
        'numpy',
        'langcodes',
        'joblib',
        'tqdm',
    ],
    classifiers=[
        'Development Status :: 3 - Alpha',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: MIT License',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: 3.7',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Python :: 3.10',
        'Programming Language :: Python :: 3.11',
    ],
    python_requires='>=3.6',
)

test_duplipy.py

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
from duplipy import *


text = "The quick brown fox jumps over the lazy dog.545 g w 4 "
print("Removed stopwords: ", remove_stopwords(text))
print("Removed numbers: ", remove_numbers(text))
print("Removed whitespace: ", remove_whitespace(text))
print("Normalize whitespace: ", normalize_whitespace(text))
print("Separate symbols: ", separate_symbols(text))
print("Remove special characters: ", remove_special_characters(text))
print("Standardize text: ", standardize_text(text))
print("Tokenize text: ", tokenize_text(text))
print("POS tag: ", pos_tag(text))
print("Insert random word (call): ", insert_random_word(text, "call"))
print("Delete random word: ", delete_random_word(text))
print("Insert synonym (feel): ", insert_synonym(text, "feel"))
print("Paraphrased: ", paraphrase(text))

print("Edit distance score between Hi and Hello: ", edit_distance_score("hi", "hello"))
print("BLEU score calculation: ", bleu_score("Hello, how are you?", "Hi, how are you doing?"))

print("Analyze sentiment: ", analyze_sentiment(text))
