R approximate string matching (fuzzy matching): agrep() function help documentation
By MicroRbt Martinez PhD
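R function name: agrep()
R function purpose: approximate string matching (fuzzy matching)
Source library: base library (ships with R)
Package containing agrep(): the package name and a description of what the package does are given in the '--Package information--' section after the main text.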
-----Description-----
Searches for approximate matches to pattern (the first argument) within each element of the character vector x (the second argument), using the generalized Levenshtein edit distance: the minimal, possibly weighted, number of insertions, deletions and substitutions needed to transform one string into another.
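As a minimal sketch of the idea, the snippet below first measures an edit distance directly and then runs an approximate search. It assumes `adist()` from the utils package (attached by default in a standard R session) to show the distance itself; the sample strings are illustrative only.

```r
# adist() reports the generalized Levenshtein distance between whole strings.
# "kitten" -> "sitting" needs two substitutions and one insertion.
d <- adist("kitten", "sitting")
print(d[1, 1])   # 3

# agrep() looks for the pattern approximately *within* each element of x.
# "lasy" is one substitution away from the substring "lazy", which fits the
# default max.distance of 0.1, so element 1 is reported as a match.
hit <- agrep("lasy", "1 lazy 2")
print(hit)       # 1
```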
-----Usage-----
agrep(pattern, x, max.distance = 0.1, costs = NULL,
ignore.case = FALSE, value = FALSE, fixed = TRUE,
useBytes = FALSE)
agrepl(pattern, x, max.distance = 0.1, costs = NULL,
ignore.case = FALSE, fixed = TRUE, useBytes = FALSE)
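The two signatures differ mainly in what they return. The following sketch, using made-up sample strings, contrasts the index, value, and logical forms of the result.

```r
x <- c("1 lazy", "1", "1 LAZY")

# agrep() returns the indices of the matching elements by default ...
idx <- agrep("laysy", x, max.distance = 2)
print(idx)     # 1

# ... or the matching elements themselves with value = TRUE.
vals <- agrep("laysy", x, max.distance = 2, value = TRUE)
print(vals)    # "1 lazy"

# agrepl() returns one logical per element of x.
flags <- agrepl("laysy", x, max.distance = 2)
print(flags)   # TRUE FALSE FALSE

# ignore.case = TRUE lets the upper-case element match as well.
idx_ci <- agrep("laysy", x, max.distance = 2, ignore.case = TRUE)
print(idx_ci)  # 1 3
```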
-----Arguments-----
Argument pattern: a non-empty character string, or a character string containing a regular expression (for fixed = FALSE), to be matched. Coerced by as.character to a string if possible.
Argument x: character vector where matches are sought. Coerced by as.character to a character vector if possible.
Argument max.distance: maximum distance allowed for a match. Expressed either as an integer, or as a fraction of the pattern length times the maximal transformation cost (replaced by the smallest integer not less than the corresponding fraction), or as a list with possible components
cost: maximum number/fraction of match cost (generalized Levenshtein distance)
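The three ways of giving max.distance can be sketched as follows. The sample strings are illustrative, and the `substitutions` list component used here is one of the components described in R's own help page for agrep() rather than in the text above, so treat it as an assumption of this sketch.

```r
# Fraction: 0.1 of the pattern length (4) times the maximal transformation
# cost (1 when costs = NULL), rounded up to the next integer.
budget <- ceiling(0.1 * nchar("lasy") * 1)
print(budget)      # 1

# Integer: allow up to two edits in total.
hits_int <- agrep("laysy", c("1 lazy", "1", "1 LAZY"), max.distance = 2)
print(hits_int)    # 1

# List: forbid substitutions entirely. "lazy" can then no longer match
# "lasy", so only the element containing an exact "lasy" is reported.
hits_list <- agrep("lasy", c(" 1 lazy 2", "1 lasy 2"),
                   max.distance = list(substitutions = 0))
print(hits_list)   # 2
```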