Implementing a Spell Checker in C (C-Spellchecker)



Original assignment link: Implementations of a Spell Checker

C-Spellchecker Implementation

Introduction

You will create a very basic spell checker. Your program will read the “dictionary” (list of words) into a Set, then check the words in the input against the Set of words. You will provide several implementations of the Set ADT for strings.

We’ve provided an interface to a Set of strings. Recall, a Set is an unordered, finite collection of unique keys.

The Set ADT

You will implement the Set over:

  • A Linked List of strings
  • A sorted Array of strings

The Interface

The Set will have the following functions:

  • setSize — return the size of the Set
  • setInsert — insert an element into the Set
  • setFind — check if element is in the Set

We’ll need 2 other functions for creation and cleanup:

  • set() — create an empty Set
  • setKill() — decommission the Set; return all heap memory, etc.

Look at ~kschmidt/public_html/CS265/Assignments/C-Spellchecker/set.h.

Set is declared in the interface (.h) file. It is simply the set_t type, which might be different for each of the implementations.

All of the functions take a pointer to a Set (Set*), so that the Set itself can be modified as needed.

The Implementations

All of these will copy the key (a string) into the set. That is, when you add an item to the Set, it will be copied into heap memory, and a pointer to that memory will be stored. See strdup. It’ll do most of the work for you.
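A minimal sketch of why the copy matters, assuming a hypothetical helper `store_copy` (not part of set.h): the caller's buffer is usually reused for the next word, so the Set has to keep its own heap copy rather than the caller's pointer.

```c
#define _POSIX_C_SOURCE 200809L  /* makes strdup visible on strict compilers */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store a private heap copy of word in slots[i].
   Returns 1 on success, 0 if memory could not be allocated.
   The copy must later be released with free(). */
int store_copy(char *slots[], size_t i, const char *word)
{
    char *copy = strdup(word);   /* allocates and copies in one call */
    if (copy == NULL)
        return 0;
    slots[i] = copy;
    return 1;
}
```

Because the slot owns its copy, overwriting the caller's buffer afterwards does not change what is stored.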

Remember, strings are arrays of characters. You cannot assign arrays, nor strings, as you would, e.g., integers. Draw pictures. Also, for the same reason, you can’t use relational operators. If you have an ASCII string, you can use the strcmp function.
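As a small illustration, the two wrappers below (hypothetical names, not part of the assignment) show how strcmp stands in for `==` and `<` on strings. Writing `a == b` would only compare the two addresses.

```c
#include <string.h>

/* Hypothetical helper: true if the two strings have the same contents. */
int same_word(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Hypothetical helper: true if a sorts strictly before b
   in strcmp (byte-wise) order. */
int comes_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```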

When removing, or decommissioning, all heap memory must be returned.

Also note, for each implementation X.c, X.c and main.c are compiled together to make the executable X. See the Makefile.

None of the implementations will store duplicates, so your Inserts must check.

Each of the implementations must define the set_t type, to store the actual Set.

Files are in the assignment directory.

Unordered Linked List

  • Implementation file: ll.c
  • Makefile target: ll
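
One possible shape for ll.c, given as a sketch rather than the reference solution. The `struct node` type and its field names are assumptions. The `Set` typedef is repeated here only so the block stands alone; in the real file it would come from set.h.

```c
#define _POSIX_C_SOURCE 200809L  /* makes strdup visible on strict compilers */
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct set_t Set;        /* normally provided by set.h */

struct node {                    /* hypothetical list node */
    char *key;                   /* heap copy of the word */
    struct node *next;
};

struct set_t {
    struct node *head;
    size_t n;
};

Set *set(void)
{
    Set *s = malloc(sizeof *s);
    if (s != NULL) {
        s->head = NULL;
        s->n = 0;
    }
    return s;
}

size_t setSize(const Set *s) { return s->n; }

bool setFind(const Set *s, const char *x)
{
    for (const struct node *p = s->head; p != NULL; p = p->next)
        if (strcmp(p->key, x) == 0)
            return true;
    return false;
}

bool setInsert(Set *s, const char *x)
{
    if (setFind(s, x))
        return false;            /* no duplicates */
    struct node *nd = malloc(sizeof *nd);
    if (nd == NULL)
        return false;
    nd->key = strdup(x);
    if (nd->key == NULL) {
        free(nd);
        return false;
    }
    nd->next = s->head;          /* unordered, so push at the front */
    s->head = nd;
    s->n++;
    return true;
}

void setKill(Set *s)
{
    if (s == NULL)
        return;
    struct node *p = s->head;
    while (p != NULL) {
        struct node *next = p->next;
        free(p->key);
        free(p);
        p = next;
    }
    free(s);
}
```

Pushing at the head keeps the list manipulation itself constant-time, but setInsert still calls setFind first, so each insert is linear in the number of stored words.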

Sorted Array

  • Implementation file: arrs.c
  • Makefile target: arrs

As you read a word, insert it into its correct place within the array, so that the array is always sorted. Note that this is the insertion step of Insertion Sort.
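
The insertion step can be sketched as follows. `sorted_insert` and its parameters are hypothetical and work on a bare array so that the block stands alone. In arrs.c the same logic would live inside setInsert and operate on the fields of set_t.

```c
#define _POSIX_C_SOURCE 200809L  /* makes strdup visible on strict compilers */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: insert a heap copy of word into arr[0..*n),
   which must already be sorted in strcmp order. cap is the capacity
   of arr. Returns false if the array is full, the word is already
   present, or memory runs out. */
bool sorted_insert(char *arr[], size_t *n, size_t cap, const char *word)
{
    if (*n >= cap)
        return false;

    /* Walk left from the end past every element greater than word. */
    size_t i = *n;
    while (i > 0 && strcmp(arr[i - 1], word) > 0)
        i--;

    /* The element just left of the gap is the only possible duplicate. */
    if (i > 0 && strcmp(arr[i - 1], word) == 0)
        return false;

    char *copy = strdup(word);
    if (copy == NULL)
        return false;

    /* Shift the tail one slot to the right to open position i. */
    memmove(&arr[i + 1], &arr[i], (*n - i) * sizeof arr[0]);
    arr[i] = copy;
    (*n)++;
    return true;
}
```

This version scans linearly for the position. A binary search could locate it faster, but the shift still costs linear time per insert.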

Note, if you have a sorted array, then you can find things more quickly than simply plodding through the array. Have the setFind implement the Binary Search.
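
For reference, an iterative binary search over a sorted array of strings might look like this. `sorted_contains` is a hypothetical name. It uses a half-open range so that the unsigned indices never go below zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if word occurs in arr[0..n), which must be
   sorted in strcmp order. */
bool sorted_contains(char *const arr[], size_t n, const char *word)
{
    size_t lo = 0, hi = n;                 /* search range is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;   /* avoids overflow of lo + hi */
        int cmp = strcmp(arr[mid], word);
        if (cmp == 0)
            return true;
        if (cmp < 0)
            lo = mid + 1;                  /* word can only be to the right */
        else
            hi = mid;                      /* word can only be to the left */
    }
    return false;
}
```

Each iteration halves the range, so a lookup takes on the order of log2(n) comparisons instead of up to n.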

The Dictionary

In main.c you will write your client code, that uses your Set to check a file.

The location of the dictionary you are to use will be provided through the environment variable WORDS. Read all of the words in the dictionary into memory, using your Set.

This file will contain one word per line. You may not assume that the words in the file are given in alphabetical order.
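
A sketch of the dictionary-reading side, assuming two hypothetical helpers: `open_dictionary` looks up WORDS with getenv, and `read_word_line` reads one line and strips the line ending so that it does not end up as part of the stored key.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: open the file named by the WORDS environment
   variable. Returns NULL (after a message on stderr) on failure. */
FILE *open_dictionary(void)
{
    const char *path = getenv("WORDS");
    if (path == NULL) {
        fprintf(stderr, "WORDS is not set\n");
        return NULL;
    }
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        perror(path);
    return fp;
}

/* Hypothetical helper: read the next line into buf and remove the
   trailing "\n" or "\r\n". Returns 1 if a line was read, 0 at end of
   file. */
int read_word_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}
```

In main.c, each line returned by `read_word_line` would be passed to setInsert. Since setInsert copies the string, the same buffer can be reused for every line.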

For this assignment you may assume that the number of words in the dictionary is bounded above. See the constant MAX_SET_SIZE in set.h. If the dictionary has more words, print an informative error message to stderr, then exit (after cleaning up any memory).

The Input File to be Checked

You will read the name of the file to be checked as a command-line argument. If there is no filename given, you will read stdin. See CS265/Labs/C/fgets.c for an example of doing this.
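
The choice between a named file and stdin can be sketched like this. `open_input` is a hypothetical helper, not taken from fgets.c.

```c
#include <stdio.h>

/* Hypothetical helper: return the stream to spell-check.
   With no filename argument this is stdin; otherwise it is the named
   file, or NULL (after a message on stderr) if it cannot be opened. */
FILE *open_input(int argc, char *argv[])
{
    if (argc < 2)
        return stdin;
    FILE *fp = fopen(argv[1], "r");
    if (fp == NULL)
        perror(argv[1]);
    return fp;
}
```

The rest of the program can then read from the returned `FILE *` without caring where it came from. Only a stream that is not stdin needs an fclose at the end.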

For this assignment, do not treat punctuation specially. Do not attempt to parse it out. Just take any token separated by white space to be a word to be checked.

For this assignment you may assume that the length of a single word is bounded above. See the constant MAX_WORD_SIZE in set.h.
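
One way to read whitespace-separated tokens is fscanf with a width-limited `%s`. The sketch below assumes an 80-byte buffer, mirroring MAX_WORD_SIZE in set.h. `WORD_BUF` and `next_token` are hypothetical names.

```c
#include <stdio.h>

enum { WORD_BUF = 80 };   /* assumed to match MAX_WORD_SIZE in set.h */

/* Hypothetical helper: read the next whitespace-delimited token into
   buf. Returns 1 if a token was read, 0 at end of input.
   The width 79 leaves room for the terminating '\0'. */
int next_token(FILE *in, char buf[WORD_BUF])
{
    return fscanf(in, "%79s", buf) == 1;
}
```

`%s` skips leading whitespace and stops at the next whitespace character, so punctuation stays attached to the token, which is what the assignment asks for. Each token would then be looked up with setFind and printed with a trailing newline if it is not found.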

For each word in the file, if it is not in your set, assume that it is misspelt. Print each misspelt word you find to stdout, one per line, single-spaced.

Output

Print to stdout every occurrence of each misspelt word in the input file, one per line, in the order in which you encounter them. Just the word. No decoration, no commentary.


  
set.h

#ifndef __MY_SET_H__
#define __MY_SET_H__

#include <stddef.h>

//interface for the Set

//  Constants  //

enum consts {
    
	MAX_SET_SIZE = 30000 ,
	MAX_WORD_SIZE = 80 
} ;



//  Set i/f  //

typedef struct set_t Set ;

	// set - factory function
	// Returns a pointer to a new, empty set
	// Should be disposed of using setKill
Set* set() ;

	// setSize - return size of the set pointed to by s
size_t	setSize( const Set* s ) ;

	// return 1 if x in Set pointed to by s
_Bool setFind( const Set* s, const char* x ) ;

	// setInsert - Insert string x into set pointed to by s
	// x is copied into heap memory (see strdup)
	// return true (1) if x was successfully inserted
	// false (0) otherwise (x is already in set, no memory, etc.)
_Bool setInsert( Set* s, const char* x ) ;

	// setKill - Decommissions 
void setKill( Set* ) ;


#endif // __MY_SET_H__




  

arr.c


//Sorted array implementation of the Set

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <inttypes.h> 
#include "set.h"

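// NOTE: set.h already defines MAX_SET_SIZE (30000) as an enum constant.
// This macro comes after the #include, so it overrides that value for
// the rest of this file only; main.c still sees 30000.
#define MAX_SET_SIZE 130000

struct set_t {
	char* words[ MAX_SET_SIZE ] ;	// heap copies of the words
	size_t n ;			// number of words currently stored
};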

Set * set()
{
	size_t i = 0 ;
	Set *data = malloc(sizeof(struct set_t));
	if (data == NULL) {
		return NULL;
	}
	for(i=0;i<MAX_SET_SIZE;i++) data->words[i]=NULL;
	data->n = 0;
	return data;
}

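// setPrintf - debugging helper, prints every stored word
void setPrintf( Set* s )
{
	// size_t index: s->n is unsigned, so avoid a signed/unsigned comparison
	for(size_t i=0;i<s->n;i++)
	{
		printf("%s",s->words[i]);
	}
}

size_t	setSize( const Set* s )
{
	return s->n ;
}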

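// isStrExit - recursive binary search for dest in s->words[left..right]
// returns true if dest is present, false otherwise
static bool isStrExit(const Set* s, const char *dest, int left, int right)
{
	// An empty range (left > right) means dest is absent. This check alone
	// also covers empty and one-element sets; the original extra test
	// (s->n)>1 skipped the search when the set held exactly one word.
	if (left<=right)
	{
		int mid = left + (right - left) / 2;
		int cmp = strcmp(s->words[mid], dest);
		if (cmp == 0)
		{
			return true;
		}
		else if (cmp > 0)
		{
			return isStrExit(s, dest, left, mid - 1);
		}
		else
		{
			return isStrExit(s, dest, mid + 1, right);
		}
	}
	return false;
}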