[Interview Question] Big-Number Computation


sayhello_world


Category: Data Structures. Tags: big data, big-number computation, interview questions

Article link: https://blog.csdn.net/sayhello_world/article/details/70856724


Big-Number Computation

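Example problem:

Given a set of one billion numbers, each 1 to 64 digits long, compute their average.

The sum of these numbers may exceed the range of long long, which is

(-9223372036854775808 to 9223372036854775807), so the running total can easily overflow. What can be done?

Approach: store any number that is out of range as a character array, perform addition, subtraction, multiplication, and division directly on the character arrays, and return the result as a character array.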

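Operations:

Addition:

1. Neither operand is out of range

If the two operands have opposite signs, add them directly: the magnitude of the sum cannot exceed the larger of the two magnitudes, so the result stays in range.

If they have the same sign, add them directly only when the sum also stays within range.

2. One or both operands are out of range (or the same-sign sum would be)

If the signs are the same, call the addition function.

If the signs differ, call the subtraction function.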
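The in-range test for same-sign addition can be written so that the check itself never overflows. The following is a minimal sketch rather than the article's code: `add_fits_in_int64` is a hypothetical helper name, and `LLONG_MAX` / `LLONG_MIN` from `<climits>` stand in for the `MAXValue` / `MINValue` constants defined in BigData.cpp.

```cpp
#include <climits>

// Hypothetical helper: returns true when a + b fits in long long.
// The test never evaluates a + b itself, so it cannot overflow.
bool add_fits_in_int64(long long a, long long b)
{
    if ((a >= 0) != (b >= 0))
        return true;                // opposite signs: |a + b| <= max(|a|, |b|)
    if (a >= 0)
        return b <= LLONG_MAX - a;  // both non-negative: a + b <= MAX
    return b >= LLONG_MIN - a;      // both negative: a + b >= MIN
}
```

When the helper returns false, the operands would be handed to the string-based addition function instead.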

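Subtraction:

1. Neither operand is out of range

If the signs are the same, call the subtraction function (the operands may be both positive or both negative).

If the signs differ (the first is positive and the second negative, or the other way round), flip the sign of the second operand and add, provided the result stays within range.

2. One or both operands are out of range

If the signs are the same, call the subtraction function.

If the signs differ, flip the sign of the subtrahend and call the addition function.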
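For the opposite-sign case, the range test has to avoid computing `a - b` before it knows the result fits. A small sketch under the same assumptions as before (`sub_fits_in_int64` is a hypothetical name, `LLONG_MAX` / `LLONG_MIN` replace the article's constants):

```cpp
#include <climits>

// Hypothetical helper: returns true when a - b fits in long long.
// Each comparison is rearranged so that no intermediate value can overflow.
bool sub_fits_in_int64(long long a, long long b)
{
    if ((a >= 0) == (b >= 0))
        return true;                // same sign: the difference always fits
    if (a >= 0)
        return a <= LLONG_MAX + b;  // b < 0:  a - b <= MAX  <=>  a <= MAX + b
    return a >= LLONG_MIN + b;      // b >= 0: a - b >= MIN  <=>  a >= MIN + b
}
```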

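Multiplication:

1. The result is 0

If either the multiplier or the multiplicand is 0, the result is 0.

2. The result is (plus or minus) the multiplicand

If the multiplier is 1 or -1, the result is the multiplicand, with its sign flipped when the multiplier is -1.

3. The result is (plus or minus) the multiplier

Same as above with the roles swapped: the multiplicand is 1 or -1.

4. Ordinary multiplication

Call the multiplication function.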

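Division:

1. The result is 0

The numerator is 0.

2. Neither operand is out of range

Divide directly.

3. The absolute value of the numerator is smaller than that of the denominator

Return 0.

4. The absolute values of the numerator and denominator are equal

Return 1 or -1, depending on whether the signs match.

5. The denominator is 1 or -1

Return the numerator, with its sign flipped when the denominator is -1.

6. Otherwise, call the division function.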

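Addition function:

Approach: swap the operands so that the longer one is on the left, then take the digits from the character arrays one position at a time, starting from the lowest, and add them.

The carry is the digit sum divided by 10; it is added into the next higher position.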
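The digit-by-digit addition described above can be sketched in isolation. This is an illustrative version, not the article's `Add`: the hypothetical `add_digits` works on unsigned decimal strings without the leading sign character that `BigData` stores.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: adds two non-negative decimal strings (no sign character).
std::string add_digits(std::string a, std::string b)
{
    if (a.size() < b.size())
        std::swap(a, b);                     // keep the longer operand in a
    std::string result(a.size() + 1, '0');   // one spare position for the final carry
    int carry = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        int sum = (a[a.size() - 1 - i] - '0') + carry;
        if (i < b.size())
            sum += b[b.size() - 1 - i] - '0';
        result[result.size() - 1 - i] = char('0' + sum % 10);
        carry = sum / 10;                    // carried into the next higher position
    }
    result[0] = char('0' + carry);
    // Drop the spare position if it stayed 0, but keep a single "0" for a zero sum
    std::size_t first = result.find_first_not_of('0');
    return first == std::string::npos ? std::string("0") : result.substr(first);
}
```

For example, `add_digits("999", "1")` walks the carry through every position and uses the spare leading slot for the final 1.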

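Subtraction function:

Approach: a subtraction may come from adding one positive and one negative number, or from subtracting two numbers of the same sign.

If both are positive, put the larger value on the left (the result is negative when the operands had to be swapped). If both are negative, the result is negative when the first has the larger magnitude.

Then subtract digit by digit. If a difference is below 0, borrow from the next higher position: that position is reduced by 1 and the current one gains 10.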
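The borrow step is easier to follow on unsigned strings. The sketch below assumes inputs without a sign character or leading zeros; `magnitude_less`, `sub_digits` and `subtract` are hypothetical names, not functions from the article.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true when decimal string a is smaller than b (no leading zeros).
bool magnitude_less(const std::string& a, const std::string& b)
{
    return a.size() != b.size() ? a.size() < b.size() : a < b;
}

// Hypothetical helper: computes a - b digit by digit, assuming a >= b.
std::string sub_digits(const std::string& a, const std::string& b)
{
    std::string result(a.size(), '0');
    int borrow = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        int diff = (a[a.size() - 1 - i] - '0') - borrow;
        if (i < b.size())
            diff -= b[b.size() - 1 - i] - '0';
        borrow = diff < 0 ? 1 : 0;           // borrow from the next higher position
        if (diff < 0)
            diff += 10;
        result[a.size() - 1 - i] = char('0' + diff);
    }
    std::size_t first = result.find_first_not_of('0');
    return first == std::string::npos ? std::string("0") : result.substr(first);
}

// Puts the larger magnitude on the left and marks the result negative after a swap.
std::string subtract(const std::string& a, const std::string& b)
{
    if (magnitude_less(a, b))
        return "-" + sub_digits(b, a);
    return sub_digits(a, b);
}
```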

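Multiplication function:

Approach: if the two numbers have different signs, the result is negative. Two counters are kept: an offset (the shift for the current row) and a carry.

Use two nested loops. If the current digit of the left operand is 0, just shift and move on.

Otherwise multiply that digit by each digit of the right operand, starting from the lowest, and add the product to the digit already stored at that position. The sum divided by 10 is the carry.

After each pass of the inner loop, shift one position toward the higher digits.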
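As a cross-check for the row-by-row scheme, here is a sketch of a common variant: it first accumulates every partial product in an integer cell and only then propagates the carries. It is not the article's loop, and `mul_digits` is a hypothetical name operating on unsigned decimal strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: schoolbook multiplication of two non-negative decimal strings.
std::string mul_digits(const std::string& a, const std::string& b)
{
    // A product of m and n digits has at most m + n digits
    std::vector<int> cells(a.size() + b.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            cells[i + j + 1] += (a[i] - '0') * (b[j] - '0');

    // Propagate carries from the lowest position to the highest
    for (std::size_t k = cells.size() - 1; k > 0; --k)
    {
        cells[k - 1] += cells[k] / 10;
        cells[k] %= 10;
    }

    std::string result;
    for (std::size_t k = 0; k < cells.size(); ++k)
        result.push_back(char('0' + cells[k]));
    std::size_t first = result.find_first_not_of('0');
    return first == std::string::npos ? std::string("0") : result.substr(first);
}
```

Deferring the carries keeps the inner loop to a single multiply-and-add, at the cost of one extra pass over the result.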

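Division function:

Approach: obtain each quotient digit by repeated subtraction. If the current part of the left operand (the dividend) is smaller than the right operand (the divisor), that quotient digit is 0; otherwise subtract the divisor repeatedly and count the subtractions.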
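In the averaging problem the divisor is only the count of numbers (one billion), which fits in a machine word. For that special case a simpler technique than repeated subtraction is enough: short division, carrying a machine-integer remainder from digit to digit. The sketch below is an assumption-laden illustration, not part of the article's class; `div_small` is a hypothetical name, and the divisor is assumed to be positive and no larger than `ULLONG_MAX / 10` so that `remainder * 10 + 9` cannot overflow.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: divides a non-negative decimal string by a small positive
// divisor and returns the truncated quotient as a decimal string.
// Assumes 0 < divisor <= ULLONG_MAX / 10.
std::string div_small(const std::string& digits, unsigned long long divisor)
{
    std::string quotient;
    unsigned long long remainder = 0;
    for (char c : digits)
    {
        remainder = remainder * 10 + (c - '0');   // bring down the next digit
        quotient.push_back(char('0' + remainder / divisor));
        remainder %= divisor;
    }
    std::size_t first = quotient.find_first_not_of('0');
    return first == std::string::npos ? std::string("0") : quotient.substr(first);
}
```

With the sum held as a decimal string, the average would then be `div_small(sum, 1000000000ULL)`.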

BigData.cpp

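#include "BigData.h"
#include <algorithm>	// std::reverse
#include <cctype>	// isspace
#include <cstring>	// strcmp, strncmp, strlen

const INT64 UN_INT = 0xcccccccccccccccc;
const INT64 MAXValue = 9223372036854775807;
// Written as -MAX - 1 because MAX + 1 overflows a signed 64-bit integer
const INT64 MINValue = (-9223372036854775807 - 1);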

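// The header declares BigData() separately, so define it here by delegating
// to the INT64 constructor with the "uninitialized" marker value.
BigData::BigData()
: BigData(UN_INT)
{}

BigData::BigData(INT64 value)
: _value(value)
{
	// Store the sign first, then the decimal digits of the magnitude
	char symbol = '+';
	// Work on an unsigned copy: _value keeps its original value, and
	// negating the smallest INT64 cannot overflow
	unsigned long long magnitude = (unsigned long long)value;
	if (value < 0)
	{
		symbol = '-';
		magnitude = 0 - magnitude;
	}

	_strData.append(1, symbol);

	if (magnitude == 0)
	{
		_strData.append(1, '0');
		return;
	}

	// Digits come out lowest first, so reverse them afterwards
	while (magnitude > 0)
	{
		char temp = (char)(magnitude % 10 + '0');
		magnitude = magnitude / 10;
		_strData.append(1, temp);
	}
	std::reverse(_strData.begin() + 1, _strData.end());
}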

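BigData::BigData(const std::string& strData)
: _value(0)
, _strData("+0")
{
	// An empty string is treated as 0
	if (strData.empty())
		return;

	// e.g. "      012345" or "        "
	// Skip leading whitespace first
	const char* pData = strData.c_str();
	while (isspace((unsigned char)*pData))
		pData++;

	if (*pData == '\0')
		return;

	// e.g. "+1234567890" or "-123456789"
	// After the whitespace there must be either a sign or a digit.
	// Skip an explicit sign; a leading digit means the number is positive.
	char symbol = *pData;
	if (*pData == '+' || *pData == '-')
		pData++;
	else if (*pData >= '0' && *pData <= '9')
		symbol = '+';
	else return;

	// Skip leading zeros
	while ('0' == *pData)
		pData++;

	// No non-zero digit follows (for example "0", "-000" or "+abc"): the value stays "+0"
	if (!(*pData >= '0' && *pData <= '9'))
		return;

	// Otherwise store the digits
	_strData.resize(strlen(pData) + 1);
	_strData[0] = symbol;

	// Accumulate in an unsigned type, where wrap-around is well defined;
	// _value is only meaningful when IsINT64OverFlow() returns false
	unsigned long long magnitude = 0;
	size_t count = 1;
	while (*pData >= '0' && *pData <= '9')
	{
		magnitude = magnitude * 10 + (*pData - '0');
		_strData[count++] = *pData;
		pData++;
	}

	_value = (INT64)(symbol == '-' ? 0 - magnitude : magnitude);

	// Drop whatever followed the last digit
	_strData.resize(count);
}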

BigData BigData::operator+(const BigData& b)
{
	if (!IsINT64OverFlow() && !b.IsINT64OverFlow())
	{
		// Opposite signs: add directly, the sum can never leave the range
		if (_strData[0] != b._strData[0])
			return BigData(_value + b._value);
		else
		{
			// Same sign: add directly only if the sum stays in range
			// both positive: a + b <= MAX  <=>  MAX - a >= b
			// both negative: a + b >= MIN  <=>  MIN - a <= b
			if ((_strData[0] == '+' && MAXValue - _value >= b._value) ||
				(_strData[0] == '-' && MINValue - _value <= b._value))
				return BigData(_value + b._value);
		}
	}
	// One or both operands are out of range, or the sum would be
	if (_strData[0] == b._strData[0])
		return BigData(Add(_strData, b._strData));
	else
		return BigData(Sub(_strData, b._strData));
}

BigData BigData::operator-(BigData& b)
{
	if (!IsINT64OverFlow() && !b.IsINT64OverFlow())
	{
		// In range with the same sign: the difference cannot overflow, let Sub handle it
		if (_strData[0] == b._strData[0])
			return Sub(_strData, b._strData);
		else
		{
			// Different signs: the test is rearranged so that it cannot overflow itself
			// positive - negative: a - b <= MAX  <=>  a <= MAX + b
			// negative - positive: a - b >= MIN  <=>  a >= MIN + b
			if ((_strData[0] == '+' && _value <= MAXValue + b._value) ||
				(_strData[0] == '-' && _value >= MINValue + b._value))
				return BigData(_value - b._value);
		}
	}
	// One or both operands are out of range, or the result would be.
	// There are four cases:
	// (-1) - (-1) = (-1) + 1
	// 1 - 1  = 1 + (-1)
	// In these two the signs are equal, so call the subtraction function.
	// 1 - (-1) = 1 + 1
	// (-1) - 1 = -1 + (-1)
	// In these two the signs differ: flip the sign of the second operand
	// and call the addition function.
	if (_strData[0] == b._strData[0])
	{
		return BigData(Sub(_strData, b._strData));
	}
	else
	{
		// Flip the sign on a copy so that the caller's operand b is not modified
		std::string negated = b._strData;
		negated[0] = (negated[0] == '-') ? '+' : '-';
		return BigData(Add(_strData, negated));
	}

}

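BigData BigData::operator*(BigData& b)
{
	// Compare the digit strings rather than _value, which is meaningless for out-of-range operands
	if (_strData == "+0" || b._strData == "+0")
		return BigData(0);
	else if (strcmp(_strData.c_str() + 1, "1") == 0)
	{
		// This operand is 1 or -1: the result is b, with its sign flipped for -1.
		// Work on a copy so that b itself is not modified.
		std::string result = b._strData;
		if (_strData[0] == '-')
			result[0] = (result[0] == '-') ? '+' : '-';
		return BigData(result);
	}

	else if (strcmp(b._strData.c_str() + 1, "1") == 0)
	{
		// b is 1 or -1: the result is this operand, with its sign flipped for -1
		std::string result = _strData;
		if (b._strData[0] == '-')
			result[0] = (result[0] == '-') ? '+' : '-';
		return BigData(result);
	}

	else
		return BigData(Mul(_strData, b._strData));
}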

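BigData BigData::operator/(const BigData& b)
{
	// The quotient is 0, -1 or 1, a sign-adjusted copy, or the result of a real division

	// The denominator must not be 0
	if (b._strData == "+0")
	{
		std::cout << "Divisor must not be 0" << std::endl;
		return BigData(0);
	}

	// Both in range: divide directly. b == -1 is left to the branch further down,
	// because MIN / -1 overflows.
	if (!IsINT64OverFlow() && !b.IsINT64OverFlow() && b._value != -1)
		return BigData(_value / b._value);

	// The quotient is 0 when the numerator is 0
	// or its magnitude is smaller than the denominator's
	if ("+0" == _strData || _strData.size() < b._strData.size() ||
		(_strData.size() == b._strData.size() && strcmp(_strData.c_str() + 1, b._strData.c_str() + 1) < 0))
		return BigData(0);

	// Equal magnitudes: the quotient is 1 or -1
	if (strcmp(_strData.c_str() + 1, b._strData.c_str() + 1) == 0)
		return BigData(_strData[0] == b._strData[0] ? 1 : -1);

	// The denominator is 1 or -1: the quotient is the numerator, sign-adjusted.
	// Building the result from the string keeps _value and _strData consistent.
	if (strcmp(b._strData.c_str() + 1, "1") == 0)
	{
		std::string result = _strData;
		if ('-' == b._strData[0])
			result[0] = (result[0] == '+') ? '-' : '+';
		return BigData(result);
	}

	// Otherwise call the division function
	return Div(_strData, b._strData);
}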

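std::string BigData::Div(std::string left, std::string right)
{
	char symbol = '+';
	if (left[0] != right[0])
		symbol = '-';
	std::string strRet;
	strRet.append(1, symbol);

	// Point into the local copies so that SubLoop may modify the dividend in place
	char* Pleft = &left[1];
	char* Pright = &right[1];
	char* Pend = Pleft + (left.size() - 1);	// one past the last digit of the dividend
	int Rsize = right.size() - 1;
	int len = 0;	// digits in the current window: the remainder plus the digits brought down

	while (Pleft + len < Pend)
	{
		// Bring down the next digit of the dividend
		len++;
		// An empty remainder followed by a 0 gives a quotient digit of 0; drop that digit
		if (len == 1 && *Pleft == '0')
		{
			strRet.append(1, '0');
			Pleft++;
			len = 0;
			continue;
		}
		// If the window is smaller than the divisor, this quotient digit is 0
		if (!IsLeftBig(Pleft, len, Pright, Rsize))
			strRet.append(1, '0');
		// Otherwise subtract repeatedly; SubLoop leaves the remainder in the window
		else
			strRet.append(1, SubLoop(Pleft, len, Pright, Rsize));
	}
	// Leading zeros are stripped when the string is converted back to BigData
	return strRet;
}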

std::string BigData::Add(std::string left, std::string right)
{
	int LSize = left.size();
	int RSize = right.size();

	// Put the longer number on the left
	if (LSize < RSize)
	{
		std::swap(left, right);
		std::swap(LSize, RSize);
	}

	std::string strRet;
	// LSize already counts the sign; the extra position holds the final carry
	strRet.resize(LSize + 1);
	strRet[0] = left[0];

	char step = 0;	// carry
	for (int idx = 1; idx < LSize; ++idx)
	{
		// Add the digits from the lowest position upwards
		char temp = left[LSize - idx] - '0' + step;
		if (RSize > idx)
			temp += right[RSize - idx] - '0';
		step = temp / 10;
		strRet[LSize + 1 - idx] = temp % 10 + '0';
	}
	// The last carry becomes the leading digit (possibly 0)
	strRet[1] = step + '0';
	return strRet;
}

// Sub serves both uses of subtraction: adding one positive and one negative number,
// and subtracting two numbers of the same sign. In either case the magnitude of the
// result is | |left| - |right| |, and it carries left's sign unless |left| < |right|.
std::string BigData::Sub(std::string left, std::string right)
{
	int LSize = left.size();
	int RSize = right.size();
	char symbol = left[0];

	// Put the operand with the larger magnitude on the left (compare the digits only,
	// not the sign). After a swap the result takes the opposite of left's original sign.
	if (LSize < RSize ||
		(LSize == RSize && strcmp(left.c_str() + 1, right.c_str() + 1) < 0))
	{
		symbol = (left[0] == '+') ? '-' : '+';
		std::swap(left, right);
		std::swap(LSize, RSize);
	}

	std::string strRet;
	strRet.resize(LSize + 1);
	strRet[0] = symbol;
	// Index 1 is the spare leading position (Add uses it for the carry); here it is always 0
	strRet[1] = '0';

	char temp;
	for (int idx = 1; idx < LSize; ++idx)
	{
		temp = left[LSize - idx] - '0';
		if (RSize > idx)
			temp -= right[RSize - idx] - '0';
		// Borrow from the next higher digit
		if (temp < 0)
		{
			left[LSize - idx - 1] -= 1;
			temp += 10;
		}
		strRet[LSize - idx + 1] = temp + '0';
	}
	return strRet;
}

std::string BigData::Mul(std::string left, std::string right)
{
	int LSize = left.size();
	int RSize = right.size();
	char symbol = '+';

	if (left[0] != right[0])
		symbol = '-';

	// Put the shorter number on the left; it drives the outer loop
	if (LSize > RSize)
	{
		std::swap(left, right);
		std::swap(LSize, RSize);
	}

	size_t resSize = LSize + RSize - 1;
	std::string strRet(resSize, '0');
	strRet[0] = symbol;

	int offset = 0;	// shift applied to the current row
	char step = 0;	// carry

	// Outer loop: one row per digit of left. Inner loop: multiply that digit by every digit of right.
	for (int i = 1; i < LSize; ++i)
	{
		char cLeft = left[LSize - i] - '0';
		step = 0;
		if (cLeft == 0)
		{
			// A zero digit on the left contributes nothing: just shift, no need to multiply
			offset++;
			continue;
		}

		for (int j = 1; j < RSize; ++j)
		{
			// cLeft is already a number, while right[...] is still a character
			char temp = cLeft * (right[RSize - j] - '0') + step;
			// Add the digit already stored at this position; offset accounts for the row shift
			temp = temp + strRet[LSize + RSize - 1 - j - offset] - '0';
			step = temp / 10;	// carry
			strRet[LSize + RSize - 1 - j - offset] = temp % 10 + '0';	// remaining digit
		}
		// Store the row's final carry in the next higher position, which no earlier row has touched
		strRet[LSize - 1 - offset] += step;
		// Shift one position toward the higher digits after every row
		offset++;
	}
	return strRet;
}

// Returns true when the LSize-digit number at Pleft is greater than or equal to
// the RSize-digit number at Pright (neither may have leading zeros)
bool BigData::IsLeftBig(char*& Pleft, int& LSize, char*&Pright, int & RSize)
{
	if (LSize > RSize || (LSize == RSize && strncmp(Pleft, Pright, LSize) >= 0))
		return true;
	return false;
}

char BigData::SubLoop(char*&Pleft, int&Lsize, char*&Pright, int&Rsize)
{
	// Division is repeated subtraction:
	// count is the number of subtractions performed, i.e. the quotient digit
	char count = '0';
	while (IsLeftBig(Pleft, Lsize, Pright, Rsize))
	{
		for (int i = 0; i < Lsize; ++i)
		{
			char temp = Pleft[Lsize - 1 - i] - '0';
			if (i < Rsize)
				temp -= (Pright[Rsize - 1 - i] - '0');
			if (temp < 0)
			{
				// Borrow from a higher digit
				int step = 1;	// how many positions up the borrow reaches
				// Every '0' passed on the way becomes '9' and the borrow moves one position further
				while ((1 + i + step < Lsize) && Pleft[Lsize - 1 - i - step] == '0')
				{
					Pleft[Lsize - 1 - i - step] = '9';
					step++;
				}
				// Take 1 from the first non-zero digit
				Pleft[Lsize - 1 - i - step]--;
				temp += 10;
			}
			Pleft[Lsize - 1 - i] = temp + '0';
		}
		count++;
		while (Lsize > 0 && *Pleft == '0') // strip leading zeros from the remainder
		{
			Pleft++;
			Lsize--;
		}
	}
	return count;
}

bool BigData::IsINT64OverFlow() const
{
	std::string strTemp = "+9223372036854775807";
	if (_strData[0] == '-')
		strTemp = "-9223372036854775808";

	// Fewer digits than the limit: always in range
	if (_strData.size() < strTemp.size())
		return false;
	// Same number of digits: in range as long as the digits do not exceed the limit
	else if (_strData.size() == strTemp.size() && _strData <= strTemp)
		return false;
	return true;
}

BigData.h

#pragma once
#include <string>
#include <iostream>
#include <stdlib.h>
using namespace std;

typedef long long INT64;

struct BigData
{
public:
	BigData();
	BigData(INT64 value);
	BigData(const std::string& strData);

	BigData operator+(const BigData& b); 
	BigData operator-(BigData& b);
	BigData operator*(BigData& b);
	BigData operator/(const BigData& b);

	friend std::ostream & operator<<(std::ostream & _cout, const BigData & b)
	{
		// Skip the '+' of positive numbers when printing; no cast is needed for read-only access
		const char *pData = b._strData.c_str();
		if ('+' == *pData)
			pData++;
		_cout << pData;
		return _cout;
	}

private:
	std::string Add(std::string left,std::string right);
	std::string Sub(std::string left, std::string right);
	std::string Mul(std::string left, std::string right);
	std::string Div(std::string left, std::string right);
	bool IsINT64OverFlow()const;
	char SubLoop(char*&Pleft, int&Lsize, char*&Pright, int&Rsize);
	bool IsLeftBig(char*& Pleft, int& LSize, char*&Pright, int & RSize);

private:
	INT64 _value;
	std::string _strData;
};
