On equality of CharSequence, String, and similar types

Reposted from: http://feifei-lee.iteye.com/blog/1163029


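When checking whether two things are the same, I habitually reached for "==". Only after paying a heavy price did I learn:

  1. In Java, "==" on objects compares references (whether both sides are the same object), not contents. It compares values only for primitive types such as int, short, long, char, and boolean, and it is also the right way to test for null. A String is an object, so comparing two Strings with "==" compiles but can give the wrong result.
  2. Two CharSequence objects cannot reliably be compared for equality directly. Convert them to String first with CharSequence.toString(), or use String.contentEquals(CharSequence).

  3. Compare String values with equals(): string1.equals(string2);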
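
The three rules above can be sketched in a few lines of Java. This is only an illustration: the class name EqualityDemo and the helper sameText are made up for this example and do not come from the original post.

```java
final class EqualityDemo {
    /** Hypothetical helper: compares two CharSequences by their characters. */
    static boolean sameText(CharSequence a, CharSequence b) {
        if (a == null || b == null) {
            return a == b; // "==" is the right tool for null checks
        }
        // Normalize one side to String, then compare contents.
        return a.toString().contentEquals(b);
    }

    public static void main(String[] args) {
        String literal = "hello";
        String built = new String("hello"); // deliberately a distinct object

        System.out.println(literal == built);      // false: two different objects
        System.out.println(literal.equals(built)); // true: same characters

        CharSequence fromBuilder = new StringBuilder("hello");
        System.out.println(literal.equals(fromBuilder));        // false: argument is not a String
        System.out.println(literal.contentEquals(fromBuilder)); // true: compares characters
        System.out.println(sameText(literal, fromBuilder));     // true
    }
}
```

Note that String.equals() returns false for any argument that is not itself a String, which is why the StringBuilder case needs toString() or contentEquals().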


CharSequence VS String in Java?

Reposted from: http://stackoverflow.com/questions/1049228/charsequence-vs-string-in-java

When programming for Android, most text values are expected as CharSequence.

Why is that? What is the benefit, and what are the main impacts, of using CharSequence over String?

What are the main differences, and what issues should be expected when using them and converting from one to the other?

Answer:
1. Strings are CharSequences, so you can just use Strings and not worry. Android is merely trying to be helpful by allowing you to also specify other CharSequence objects, like StringBuffers.

  • Except when Android passes me a CharSequence in a callback and I need a String - call charSeq.toString(). – Martin Konicek Jul 7 '11 at 11:09  
  • But keep in mind this caveat from the CharSequence javadoc: This interface does not refine the general contracts of the equals and hashCode methods. The result of comparing two objects that implement CharSequence is therefore, in general, undefined. Each object may be implemented by a different class, and there is no guarantee that each class will be capable of testing its instances for equality with those of the other. It is therefore inappropriate to use arbitrary CharSequence instances as elements in a set or as keys in a map. – Trevor Robinson Feb 10 '12 at 23:39
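
To make the javadoc caveat concrete, here is a minimal sketch, assuming the goal is to count items by their text. The names CharSequenceKeys and countByText are hypothetical; the point is only that keys are normalized with toString() before they reach the map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class CharSequenceKeys {
    /** Hypothetical helper: counts items by text, using String keys rather than raw CharSequences. */
    static Map<String, Integer> countByText(Iterable<? extends CharSequence> items) {
        Map<String, Integer> counts = new HashMap<>();
        for (CharSequence item : items) {
            // toString() gives every implementation the same equals/hashCode contract.
            counts.merge(item.toString(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        CharSequence a = new StringBuilder("key");
        CharSequence b = new StringBuilder("key");

        // StringBuilder inherits equals() from Object, so this is a reference comparison.
        System.out.println(a.equals(b)); // false

        System.out.println(countByText(Arrays.<CharSequence>asList(a, b, "key"))); // {key=3}
    }
}
```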


2. In general, using an interface allows you to vary the implementation with minimal collateral damage. Although java.lang.String is extremely popular, in certain contexts one may want to use another implementation. By building the API around CharSequence rather than String, the code gives one the opportunity to do that.


3. I believe it is best to use CharSequence. The reason is that String implements CharSequence, so you can pass a String wherever a CharSequence is expected. HOWEVER, you cannot pass an arbitrary CharSequence where a String is expected, because a CharSequence is not necessarily a String. ALSO, in Android the EditText.getText() method returns an Editable, which also implements CharSequence and can be passed directly as a CharSequence, but not as a String without calling toString(). CharSequence handles them all!
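
A small sketch of that asymmetry, in plain Java. StringBuilder stands in for Android's Editable here so that the example runs without the Android SDK, and the names AcceptsCharSequence, countNonBlank, and shout are invented for illustration.

```java
import java.util.Locale;

final class AcceptsCharSequence {
    /** Works for any CharSequence implementation: String, StringBuilder, and so on. */
    static int countNonBlank(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    /** Hypothetical API that insists on a String parameter. */
    static String shout(String text) {
        return text.toUpperCase(Locale.ROOT) + "!";
    }

    public static void main(String[] args) {
        StringBuilder editableLike = new StringBuilder("a b c"); // stand-in for an Editable

        System.out.println(countNonBlank("a b c"));      // 3: a String is a CharSequence
        System.out.println(countNonBlank(editableLike)); // 3: so is a StringBuilder

        // shout(editableLike) would not compile; an explicit conversion is required.
        System.out.println(shout(editableLike.toString())); // A B C!
    }
}
```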


4. This is almost certainly for performance reasons. For example, imagine a parser that goes through a 500k ByteBuffer containing strings.

There are 3 approaches to returning the string content:

  1. Build a String[] at parse time, one character at a time. This will take a noticeable amount of time. We can use == instead of .equals to compare cached references.

  2. Build an int[] with offsets at parse time, then dynamically build a String when a get() happens. Each String will be a new object, so returned values cannot be cached or compared with ==.

  3. Build a CharSequence[] at parse time. Since no new data is stored (other than offsets into the byte buffer), the parsing cost is much lower than in #1. At get time, we don't need to build a String, so get performance is equal to #1 (much better than #2), as we're only returning a reference to an existing object.

In addition to the processing gains you get using CharSequence, you also reduce the memory footprint by not duplicating data. For example, if you have a buffer containing 3 paragraphs of text, and want to return either all 3 or a single paragraph, you need 4 Strings to represent this. Using CharSequence you only need 1 buffer with the data, and 4 instances of a CharSequence implementation that tracks the start and length.
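
A CharSequence implementation that "tracks the start and length" might look roughly like the sketch below. It is an assumption-laden illustration, not the answer author's code: the class name Slice is hypothetical, and it wraps a char[] rather than a ByteBuffer so that no character decoding is needed.

```java
final class Slice implements CharSequence {
    private final char[] buffer; // shared backing data, never copied by a Slice
    private final int start;
    private final int length;

    Slice(char[] buffer, int start, int length) {
        if (start < 0 || length < 0 || start > buffer.length - length) {
            throw new IndexOutOfBoundsException("start=" + start + ", length=" + length);
        }
        this.buffer = buffer;
        this.start = start;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return buffer[start + index];
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > length || from > to) {
            throw new IndexOutOfBoundsException("from=" + from + ", to=" + to);
        }
        return new Slice(buffer, start + from, to - from); // still no copy
    }

    @Override
    public String toString() {
        return new String(buffer, start, length); // copies only when a String is requested
    }

    public static void main(String[] args) {
        char[] text = "one\ntwo\nthree".toCharArray();
        CharSequence all = new Slice(text, 0, text.length);
        CharSequence second = new Slice(text, 4, 3);

        System.out.println(all.length()); // 13
        System.out.println(second);       // two
    }
}
```

Because Slice does not override equals() or hashCode(), two slices with the same text are not equal to each other; compare them by content or convert them with toString() first.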

Phil Lello


