𝗧𝗲𝘅𝘁 𝗦𝗶𝗺𝗶𝗹𝗮𝗿𝗶𝘁𝘆 𝗶𝗻 𝗡𝗟𝗣

📅1 week ago⏱1 min read

You want to measure how similar two texts are. I tested several ways to do this.

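Cosine similarity worked best for me. First you turn each text into a vector, for example word counts. Then you measure the angle between the two vectors. With word-count vectors, 0 means the texts share no words. 1 means they use the same words in the same proportions.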
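A minimal sketch of the idea, assuming plain word-count vectors and a simple lowercase tokenizer. The names tokenize and cosine_similarity are illustrative, not from the original post, and a real pipeline might use TF-IDF or embeddings instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric runs are good enough as tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    # Each text becomes a sparse word-count vector.
    vec_a = Counter(tokenize(text_a))
    vec_b = Counter(tokenize(text_b))

    # Dot product over the words the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())

    # Euclidean length of each vector.
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))

    # An empty text has no direction, so treat it as no similarity.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("the cat sat", "the cat ran"))
    print(cosine_similarity("apple banana", "car truck"))
```

Word counts ignore word order, so "dog bites man" and "man bites dog" score as identical in this sketch.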

Use dynamic programming, for example an edit distance, for custom logic. This helps when you need specific character rules, because you decide what each edit costs.
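One way to read "custom logic" is a weighted edit distance, where a cost function encodes the character rules. This is a sketch under that assumption, not the method from the original post. The case_lenient_cost rule (a case-only difference costs 0.5) is a made-up example of such a rule.

```python
def weighted_edit_distance(a, b, sub_cost=None, indel_cost=1.0):
    # Default rule: identical characters are free, any other swap costs 1.
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0

    # prev[j] holds the cost of turning the processed prefix of a into b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,            # delete ca
                curr[j - 1] + indel_cost,        # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute or keep
            ))
        prev = curr
    return prev[-1]


def case_lenient_cost(x, y):
    # Hypothetical custom rule: a case-only difference is a cheap edit.
    if x == y:
        return 0.0
    if x.lower() == y.lower():
        return 0.5
    return 1.0


def similarity(a, b, sub_cost=None):
    # Map the distance to a 0..1 score (valid while every cost is <= 1).
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_edit_distance(a, b, sub_cost) / longest


if __name__ == "__main__":
    print(weighted_edit_distance("kitten", "sitting"))
    print(similarity("Cat", "cat", case_lenient_cost))
```

Keeping only two rows of the table holds memory at O(len(b)), which matters once the strings get long.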

Use ROUGE to get an overlap score. It counts the n-grams that both strings share. You track three metrics:

Recall: Shared n-grams divided by the n-grams in the original (reference) text.
Precision: Shared n-grams divided by the n-grams in the generated text.
F1 Score: The harmonic mean of precision and recall.

Focus on precision when you want close matches. Balance it with recall, through F1, when you want complete matches.
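The three metrics can be sketched as a small ROUGE-N function. This is a simplified illustration that assumes whitespace tokenization and lowercasing. It is not the official ROUGE toolkit, which also handles stemming and other variants. The name rouge_n is illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    # Multiset of n-grams, so repeated n-grams are counted.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both texts.
    overlap = sum((ref & cand).values())
    ref_total = sum(ref.values())
    cand_total = sum(cand.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    print(rouge_n("the cat sat on the mat", "the cat sat"))
```

In this example every word of the generated text appears in the original, so precision is 1.0, but only half of the original is covered, so recall is 0.5.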

I also tried regex patterns. I used rewards and penalties for wildcards. This improved the results.
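The post does not say how the rewards and penalties were set, so the following is only one possible reading. It assumes a pattern where * is a wildcard. Each literal character that matches earns a reward, and each character the wildcard has to absorb costs a penalty. The function name wildcard_score and the default weights are hypothetical.

```python
import re


def wildcard_score(pattern, text, reward=1.0, penalty=0.5):
    # Split on '*' and join the escaped literals with lazy capture groups,
    # so each wildcard becomes one group.
    parts = pattern.split("*")
    regex = "(.*?)".join(re.escape(p) for p in parts)

    match = re.fullmatch(regex, text)
    if match is None:
        return 0.0
    if not text:
        return 1.0  # empty pattern matched empty text

    literal_chars = sum(len(p) for p in parts)
    wildcard_chars = sum(len(g) for g in match.groups())

    # Reward literal matches, penalize text swallowed by wildcards,
    # then normalize against a fully literal match of the same length.
    raw = reward * literal_chars - penalty * wildcard_chars
    return max(0.0, raw / (reward * len(text)))


if __name__ == "__main__":
    print(wildcard_score("user-*-id", "user-42-id"))
    print(wildcard_score("user-*-id", "admin-42-id"))
```

With these weights, a pattern that matches mostly through its wildcards scores lower than one that matches through literal characters.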

Source: https://dev.to/sirisha_chiruvolu_f5136d5/finding-similarity-scores-between-text-in-natural-language-processing-239n
