Text Similarity in NLP

You want to measure how similar two texts are. I tested several ways to do this.

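Cosine similarity worked best in my tests. You first turn each text into a vector, for example word counts or embeddings. The score is the cosine of the angle between the two vectors. For non-negative vectors such as word counts, 0 means the texts share nothing and 1 means the vectors point in the same direction, so the texts are effectively the same.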
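Here is a minimal sketch of the idea, assuming plain word-count vectors and a simple lowercase tokenizer. The post does not say which vectorizer was used, so the names tokenize and cosine_similarity are illustrative only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens are good enough for a demo.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    # Build sparse word-count vectors for both texts.
    vec_a = Counter(tokenize(text_a))
    vec_b = Counter(tokenize(text_b))

    # Dot product over the words the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())

    # Euclidean length of each vector.
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))

    # An empty text has no direction, so treat it as dissimilar.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("the cat sat on the mat", "the cat is on the mat"))
    print(cosine_similarity("apple banana", "car truck"))
```

Word counts ignore word order and meaning. Swapping in TF-IDF weights or sentence embeddings keeps the same formula and only changes how the vectors are built.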

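Use dynamic programming, such as an edit-distance table, when you need custom logic. This helps when you need specific character-level rules, for example a lower cost for certain substitutions.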
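One common way to do this is a weighted edit distance. The sketch below is an assumption about what "custom logic" could look like, not the method from the original post. The function names and the 0.25 cost for case-only differences are hypothetical.

```python
def weighted_edit_distance(a, b, sub_cost=None, ins_cost=1.0, del_cost=1.0):
    # Default rule: identical characters are free, any other swap costs 1.
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0

    # prev[j] = cost of turning the empty prefix of a into b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]

    for i, ch_a in enumerate(a, 1):
        # First column: delete the first i characters of a.
        curr = [i * del_cost]
        for j, ch_b in enumerate(b, 1):
            curr.append(min(
                prev[j] + del_cost,                # delete ch_a
                curr[j - 1] + ins_cost,            # insert ch_b
                prev[j - 1] + sub_cost(ch_a, ch_b),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def case_insensitive_cost(x, y):
    # Hypothetical character rule: a case-only difference is a cheap edit.
    if x == y:
        return 0.0
    if x.lower() == y.lower():
        return 0.25
    return 1.0


if __name__ == "__main__":
    print(weighted_edit_distance("kitten", "sitting"))
    print(weighted_edit_distance("Hello", "hello", case_insensitive_cost))
```

The table keeps only two rows at a time, so memory grows with the shorter dimension you iterate over, not with the product of both lengths. To turn the distance into a 0 to 1 similarity, one option is to divide by the longer string's length and subtract from 1.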

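Use ROUGE to get a similarity score. It counts the n-grams that both strings share. You track three metrics: precision, recall, and F1.

Precision is the share of the candidate's n-grams that also appear in the reference. Recall is the share of the reference's n-grams that the candidate covers. Focus on precision for close matches. Balance it with recall when you care about total coverage.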
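The n-gram counting can be sketched in a few lines. This is a simplified ROUGE-N, assuming whitespace tokenization and no stemming, so its numbers may differ from a full ROUGE package. The function names are illustrative.

```python
from collections import Counter


def ngram_counts(tokens, n):
    # Count every run of n consecutive tokens.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both texts.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(rouge_n("the cat sat on the mat", "the cat is on the mat", n=1))
    print(rouge_n("the cat", "the cat sat on the mat", n=1))
```

In the second call every candidate word appears in the reference, so precision is high while recall is low. That gap is why the two metrics are read together.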

I also tried regex patterns. I used rewards and penalties for wildcards. This improved the results.

Source: https://dev.to/sirisha_chiruvolu_f5136d5/finding-similarity-scores-between-text-in-natural-language-processing-239n