Free Plagiarism Checking with Stanford's Moss

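As the title says, I was recently working on a project and, for originality reasons, was asked to check it for similarity against GitHub projects I had previously uploaded or borrowed from. The request caught me completely off guard, because I had no similarity-checking tool on hand.

I couldn't find a suitable project on CSDN or GitHub either (everything there was aimed at simple data-structure homework and didn't meet my needs). After some digging, I found that Stanford runs a free similarity-checking server called Moss, which is very convenient.

Since the site is in English, though, I hit a few pitfalls before getting it to work. Below I share what I did, as a reference for anyone else who wants to check code with Moss. Moss website: http://theory.stanford.edu/~aiken/moss/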

1 Requirements

  • A Linux system
  • A Gmail address (@gmail)
  • Perl

For Windows, see this video: https://www.youtube.com/watch?v=4fiF2YVpJ8A

Installing Perl on Ubuntu: https://blog.csdn.net/junweifan/article/details/7260401
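
Before going further, it may help to confirm that Perl is actually present on the machine. The snippet below is a minimal sketch; `have_cmd` is a hypothetical helper name introduced here, and the `apt-get` hint assumes a Debian/Ubuntu system.

```shell
# Hypothetical helper: succeeds if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd perl; then
    # $^V holds the interpreter version; %vd prints it as dotted numbers.
    perl -e 'printf "perl v%vd is installed\n", $^V'
else
    # Assumption: a Debian/Ubuntu system where apt-get is the package manager.
    echo "perl not found; on Ubuntu try: sudo apt-get install perl" >&2
fi
```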

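2 Step-by-step procedure

1. First, go to the Moss website.

2. Open your Gmail account and compose a message to moss@moss.stanford.edu. The subject can be left blank.

3. Put the registration request in the email body. The Moss website gives the exact text: a line reading registeruser, followed by a line reading mail and then your own email address.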

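4. A reply should arrive within one to two minutes of sending (if it doesn't, check whether one of the previous steps went wrong).

The reply contains the submission script. Copy everything from the line marked with a red arrow in the original post's screenshot through the end of the email, and paste it into a file named moss (no file extension!).

One small pitfall: if you paste it into a Windows .txt file, delete the extension, and upload it to the Linux server with Xftp, Perl will report an error because the file has Windows line endings.

Running the following command on the Linux command line fixes the file so that it can run: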

perl -p -i -e "s/\R/\n/g" moss

5. Run the check

  • Make sure the files you want to check are in the same folder as the moss file

Taking my own check as an example:

drwxr-xr-x 2 root root  4096 Jul 11 12:29 ./
drwxr-xr-x 6 root root  4096 Jul 10 16:11 ../
-rw-r--r-- 1 root root  8983 Jul 10 17:05 file1.py
-rw-r--r-- 1 root root 10081 Jul 10 16:11 file2.py
-rwxr-xr-- 1 root root 11097 Jul 10 16:19 moss*
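
A quick pre-flight check can catch a non-executable script or a mistyped file name before anything is uploaded. This is only a sketch; `preflight` is a hypothetical helper, not part of Moss.

```shell
# Hypothetical helper: preflight SCRIPT FILE...
# Verifies that SCRIPT is executable and that every FILE is readable.
preflight() {
    script="$1"
    shift
    if [ ! -x "$script" ]; then
        echo "not executable: $script (try: chmod u+x $script)" >&2
        return 1
    fi
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            return 1
        fi
    done
    # After the shift, $# is the number of files to submit.
    echo "ready: $# file(s)"
}
```

With the directory listed above, `preflight ./moss file1.py file2.py` should succeed, since moss carries the execute bit and both .py files are present.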

Then run:

./moss -l python file1.py file2.py 

The terminal will show:

Checking files . . . 
OK
Uploading file1.py ...done.
Uploading file2.py ...done.
Query submitted.  Waiting for the server's response.
http://moss.stanford.edu/results/XXXXXXXXXX
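
If the check is run from a script, the results URL can be pulled out of that output automatically. The sketch below assumes the URL always has the form shown above; `extract_result_url` is a hypothetical helper, and the demo feeds it the sample output instead of making a live submission.

```shell
# Hypothetical helper: read moss output on stdin and print the last results URL found.
extract_result_url() {
    grep -Eo 'https?://moss\.stanford\.edu/results/[^[:space:]]+' | tail -n 1
}

# Demo using the sample output from this post rather than a real run.
printf '%s\n' \
    'Checking files . . . ' \
    'OK' \
    "Query submitted.  Waiting for the server's response." \
    'http://moss.stanford.edu/results/XXXXXXXXXX' | extract_result_url
```

In a real run this could be used as `./moss -l python file1.py file2.py | tee moss.log | extract_result_url`, which keeps the full output in moss.log and prints only the link.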

Open that URL to see the results.

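6 Troubleshooting

Update, 2022.08.22:
Many people have asked about the email step in the comments. The likely cause is not using a Gmail address.
The Moss website is https://theory.stanford.edu/~aiken/moss/; if you run into problems, check there first.

7. Official documentation for reference: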
#  moss [-l language] [-d] [-b basefile1] ... [-b basefilen] [-m #] [-c "string"] file1 file2 file3 ...
#
# The -l option specifies the source language of the tested programs.
# Moss supports many different languages; see the variable "languages" below for the
# full list.
#
# Example: Compare the lisp programs foo.lisp and bar.lisp:
#
#    moss -l lisp foo.lisp bar.lisp
#
#
# The -d option specifies that submissions are by directory, not by file.
# That is, files in a directory are taken to be part of the same program,
# and reported matches are organized accordingly by directory.
#
# Example: Compare the programs foo and bar, which consist of .c and .h
# files in the directories foo and bar respectively.
#
#    moss -d foo/*.c foo/*.h bar/*.c bar/*.h
#   
# Example: Each program consists of the *.c and *.h files in a directory under
# the directory "assignment1."
#
#    moss -d assignment1/*/*.h assignment1/*/*.c
#
#
# The -b option names a "base file".  Moss normally reports all code
# that matches in pairs of files.  When a base file is supplied,
# program code that also appears in the base file is not counted in matches.
# A typical base file will include, for example, the instructor-supplied
# code for an assignment.  Multiple -b options are allowed.  You should
# use a base file if it is convenient; base files improve results, but
# are not usually necessary for obtaining useful information.
#
# IMPORTANT: Unlike previous versions of moss, the -b option *always*
# takes a single filename, even if the -d option is also used.
#
# Examples:
#
#  Submit all of the C++ files in the current directory, using skeleton.cc
#  as the base file:
#
#    moss -l cc -b skeleton.cc *.cc
#
#  Submit all of the ML programs in directories asn1.96/* and asn1.97/*, where
#  asn1.97/instructor/example.ml and asn1.96/instructor/example.ml contain the base files.
#
#    moss -l ml -b asn1.97/instructor/example.ml -b asn1.96/instructor/example.ml -d asn1.97/*/*.ml asn1.96/*/*.ml
#
# The -m option sets the maximum number of times a given passage may appear
# before it is ignored.  A passage of code that appears in many programs
# is probably legitimate sharing and not the result of plagiarism.  With -m N,
# any passage appearing in more than N programs is treated as if it appeared in
# a base file (i.e., it is never reported).  Option -m can be used to control
# moss' sensitivity.  With -m 2, moss reports only passages that appear
# in exactly two programs.  If one expects many very similar solutions
# (e.g., the short first assignments typical of introductory programming
# courses) then using -m 3 or -m 4 is a good way to eliminate all but
# truly unusual matches between programs while still being able to detect
# 3-way or 4-way plagiarism.  With -m 1000000 (or any very
# large number), moss reports all matches, no matter how often they appear.  
# The -m setting is most useful for large assignments where one also a base file
# expected to hold all legitimately shared code.  The default for -m is 10.
#
# Examples:
#
#   moss -l pascal -m 2 *.pascal
#   moss -l cc -m 1000000 -b mycode.cc asn1/*.cc
#
#
# The -c option supplies a comment string that is attached to the generated
# report.  This option facilitates matching queries submitted with replies
# received, especially when several queries are submitted at once.
#
# Example:
#
#   moss -l scheme -c "Scheme programs" *.sch
#
# The -n option determines the number of matching files to show in the results.
# The default is 250.
#
# Example:
#   moss -c java -n 200 *.java
# The -x option sends queries to the current experimental version of the server.
# The experimental server has the most recent Moss features and is also usually
# less stable (read: may have more bugs).
#
# Example:
#
#   moss -x -l ml *.ml
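
For the use case that started this post (comparing a new project against older ones), the -d and -c options documented above can be combined so that each directory is treated as one program. The following is a rough sketch under assumptions: `compare_projects`, the `MOSS` variable, and the comment string are all hypothetical, and paths containing spaces are not handled.

```shell
# Path to the moss script; can be overridden, e.g. for a dry run.
MOSS="${MOSS:-./moss}"

# Hypothetical helper: compare_projects LANG EXT DIR...
# Collects DIR/*.EXT from each directory and submits them in -d (directory) mode.
compare_projects() {
    lang="$1"
    ext="$2"
    shift 2
    files=""
    for dir in "$@"; do
        for f in "$dir"/*."$ext"; do
            if [ -f "$f" ]; then
                files="$files $f"
            fi
        done
    done
    if [ -z "$files" ]; then
        echo "no .$ext files found" >&2
        return 1
    fi
    # $MOSS and $files are left unquoted on purpose so they split into separate words.
    $MOSS -l "$lang" -d -c "project comparison" $files
}
```

A call such as `compare_projects python py new_project old_project` (hypothetical directory names) would then expand to roughly `./moss -l python -d -c "project comparison" new_project/*.py old_project/*.py`.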