First Programming Assignment (Java)
Paper Plagiarism Check
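| Introduction to Engineering | https://edu.cnblogs.com/campus/jmu/ComputerScience21/homework/13034 |
| ----------------- |--------------- |
| Where the assignment requirements are | https://edu.cnblogs.com/campus/jmu/ComputerScience21/homework/13034 |
| Paper plagiarism check | The code turns a comparison of two texts into a comparison of the words they share |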

import java.util.HashMap;
import java.util.Map;

public class PaperPlagiarismChecker {
    public static void main(String[] args) {
        String paper1 = "This is the content of paper 1.";
        String paper2 = "This is the content of paper 2.";
        double similarity = calculateSimilarity(paper1, paper2);
        System.out.println("Similarity: " + similarity);
    }

    // Fraction of the words in paper1 that also occur in paper2 (repeats included).
    private static double calculateSimilarity(String paper1, String paper2) {
        Map<String, Integer> wordFrequency1 = calculateWordFrequency(paper1);
        Map<String, Integer> wordFrequency2 = calculateWordFrequency(paper2);
        int commonWords = 0;
        int totalWords = 0;
        for (Map.Entry<String, Integer> entry : wordFrequency1.entrySet()) {
            int count1 = entry.getValue();
            // A shared word counts as often as the smaller of its two frequencies.
            commonWords += Math.min(count1, wordFrequency2.getOrDefault(entry.getKey(), 0));
            totalWords += count1;
        }
        // An empty first paper would otherwise produce 0/0 = NaN.
        return totalWords == 0 ? 0.0 : (double) commonWords / totalWords;
    }

    // Counts how often each lowercase, whitespace-separated word occurs.
    private static Map<String, Integer> calculateWordFrequency(String paper) {
        Map<String, Integer> wordFrequency = new HashMap<>();
        for (String word : paper.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // split yields "" for empty input or leading whitespace
            }
            wordFrequency.merge(word, 1, Integer::sum);
        }
        return wordFrequency;
    }
}

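Approach
I compute the similarity between two papers as follows. First, each paper's text is converted to lowercase and split on whitespace into words. Then a HashMap records how often each word occurs in each paper. Next, the two frequency tables are compared: for every word in the first paper, the smaller of its two counts is added to the number of shared words, and that number is divided by the total number of words in the first paper to give the similarity. For the two sample strings in main, 6 of the 7 words in paper 1 also appear in paper 2, so the result is 6/7 ≈ 0.857. I have not yet worked out how to compare files for duplication, though.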
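
One possible way to extend the same idea to files is sketched below. It is only a sketch under a few assumptions: both files are plain UTF-8 text, the file paths are passed as command-line arguments, and the class name `FileSimilaritySketch` and the helper `fileSimilarity` are made-up names for illustration, not part of the assignment. It needs Java 11 or later for `Files.readString`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

class FileSimilaritySketch {
    // Counts lowercase, whitespace-separated words, skipping empty tokens.
    static Map<String, Integer> wordFrequency(String text) {
        Map<String, Integer> frequency = new HashMap<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                frequency.merge(word, 1, Integer::sum);
            }
        }
        return frequency;
    }

    // Shared word count divided by the word count of the first text.
    static double similarity(String text1, String text2) {
        Map<String, Integer> frequency1 = wordFrequency(text1);
        Map<String, Integer> frequency2 = wordFrequency(text2);
        int common = 0;
        int total = 0;
        for (Map.Entry<String, Integer> entry : frequency1.entrySet()) {
            common += Math.min(entry.getValue(), frequency2.getOrDefault(entry.getKey(), 0));
            total += entry.getValue();
        }
        return total == 0 ? 0.0 : (double) common / total;
    }

    // Hypothetical helper: reads both files as UTF-8 (an assumption) and compares their text.
    static double fileSimilarity(Path file1, Path file2) throws IOException {
        String text1 = Files.readString(file1, StandardCharsets.UTF_8);
        String text2 = Files.readString(file2, StandardCharsets.UTF_8);
        return similarity(text1, text2);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FileSimilaritySketch <file1> <file2>");
            return;
        }
        double result = fileSimilarity(Paths.get(args[0]), Paths.get(args[1]));
        System.out.printf("Similarity: %.2f%n", result);
    }
}
```

Because the words are still split on whitespace, this sketch only suits text in which words are separated by spaces; text without such separators would need a different tokenization step.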