2009-12-14 94 views
19

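Is there a method in C# to check whether a string is a valid identifier?

In Java, the Character class has the isJavaIdentifierStart and isJavaIdentifierPart methods, which can be used to determine whether a string is a valid Java identifier, like so: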

public boolean isJavaIdentifier(String s) { 
    int n = s.length(); 
    if (n == 0) return false; 
    if (!Character.isJavaIdentifierStart(s.charAt(0))) 
        return false; 
    for (int i = 1; i < n; i++) 
        if (!Character.isJavaIdentifierPart(s.charAt(i))) 
            return false; 
    return true; 
} 

Does C# have anything like this?

Answers

6

Basically something like this:

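// using System.Text.RegularExpressions; 
// An identifier may start with a letter-like character or an underscore. 
const string start = @"(\p{Lu}|\p{Ll}|\p{Lt}|\p{Lm}|\p{Lo}|\p{Nl}|_)"; 
const string extend = @"(\p{Mn}|\p{Mc}|\p{Nd}|\p{Pc}|\p{Cf})"; 
// Anchor the pattern so the whole string has to match, not just a substring. 
Regex ident = new Regex(string.Format(@"^{0}({0}|{1})*\z", start, extend)); 
s = s.Normalize(); 
return ident.IsMatch(s); 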
+4

OMG, 7 upvotes, and it didn't even work; it didn't even compile until I fixed the code... – 2014-12-17 10:02:37

28

Yes:

// using System.CodeDom.Compiler; 
CodeDomProvider provider = CodeDomProvider.CreateProvider("C#"); 
if (provider.IsValidIdentifier (YOUR_VARIABLE_NAME)) { 
     // Valid 
} else { 
     // Not valid 
} 

From here: How to determine if a string is a valid variable name?

+0

This does have some performance implications that you should be aware of. See my post for more information. – 2009-12-14 23:54:37

8

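I would be wary of the other solutions offered here. Calling CodeDomProvider.CreateProvider requires finding and parsing the Machine.Config file as well as your app.config file. That is likely to be several times slower than the time it takes to check the string yourself.

Instead, I would advocate that you make one of the following changes: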

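  1. Cache the provider in a static variable.

    This means you only pay for creating it once, but it slows down type loading.

  2. Create the provider directly, by constructing a Microsoft.CSharp.CSharpCodeProvider instance yourself.

    This skips the config file parsing altogether.

  3. Write the code to implement the check yourself.

    If you do this, you have the most control over how it is implemented, which can help you optimize for performance if you need to. See section 2.2.4 of the C# language spec for the complete lexical grammar for C# identifiers.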
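Option 3 above might look roughly like the following. This is only a minimal sketch, not a complete implementation of the spec: it uses the same Unicode categories as the regex answers in this thread, works on UTF-16 code units (so characters outside the BMP are not handled), and neither rejects keywords nor accepts the @ prefix. The names IdentifierCheck and IsIdentifier are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class IdentifierCheck
{
    // Letter-like categories: Lu, Ll, Lt, Lm, Lo, Nl.
    private static bool IsLetterCategory(UnicodeCategory cat)
    {
        switch (cat)
        {
            case UnicodeCategory.UppercaseLetter:
            case UnicodeCategory.LowercaseLetter:
            case UnicodeCategory.TitlecaseLetter:
            case UnicodeCategory.ModifierLetter:
            case UnicodeCategory.OtherLetter:
            case UnicodeCategory.LetterNumber:
                return true;
            default:
                return false;
        }
    }

    // First character: a letter-like character or an underscore.
    private static bool IsStart(char c)
    {
        return c == '_' || IsLetterCategory(char.GetUnicodeCategory(c));
    }

    // Later characters: letters plus Mn, Mc, Nd, Pc and Cf.
    private static bool IsPart(char c)
    {
        UnicodeCategory cat = char.GetUnicodeCategory(c);
        switch (cat)
        {
            case UnicodeCategory.NonSpacingMark:
            case UnicodeCategory.SpacingCombiningMark:
            case UnicodeCategory.DecimalDigitNumber:
            case UnicodeCategory.ConnectorPunctuation: // includes '_'
            case UnicodeCategory.Format:
                return true;
            default:
                return IsLetterCategory(cat);
        }
    }

    public static bool IsIdentifier(string s)
    {
        if (string.IsNullOrEmpty(s)) return false;

        s = s.Normalize(); // Normalization Form C is the default
        if (!IsStart(s[0])) return false;

        for (int i = 1; i < s.Length; i++)
            if (!IsPart(s[i])) return false;

        return true;
    }
}
```

Because no configuration files, regular expressions, or extra assemblies are involved, the cost per call is a single pass over the string. A keyword check would still have to be added on top if keywords are to be rejected.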

3

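I recently wrote an extension method that validates whether a string is a legal C# identifier.

You can find a gist with the implementation here: https://gist.github.com/FabienDehopre/5245476

It is based on the MSDN documentation for identifiers (http://msdn.microsoft.com/en-us/library/aa664670(v=vs.71).aspx).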

public static bool IsValidIdentifier(this string identifier) 
{ 
    if (String.IsNullOrEmpty(identifier)) return false; 

    // C# keywords: http://msdn.microsoft.com/en-us/library/x53a06bb(v=vs.71).aspx 
    var keywords = new[] 
         { 
          "abstract", "event",  "new",  "struct", 
          "as",  "explicit", "null",  "switch", 
          "base",  "extern",  "object",  "this", 
          "bool",  "false",  "operator", "throw", 
          "break",  "finally", "out",  "true", 
          "byte",  "fixed",  "override", "try", 
          "case",  "float",  "params",  "typeof", 
          "catch",  "for",  "private", "uint", 
          "char",  "foreach", "protected", "ulong", 
          "checked", "goto",  "public",  "unchecked", 
          "class",  "if",   "readonly", "unsafe", 
          "const",  "implicit", "ref",  "ushort", 
          "continue", "in",   "return",  "using", 
          "decimal", "int",  "sbyte",  "virtual", 
          "default", "interface", "sealed",  "volatile", 
          "delegate", "internal", "short",  "void", 
          "do",  "is",   "sizeof",  "while", 
          "double", "lock",  "stackalloc", 
          "else",  "long",  "static", 
          "enum",  "namespace", "string" 
         }; 

    // definition of a valid C# identifier: http://msdn.microsoft.com/en-us/library/aa664670(v=vs.71).aspx 
    const string formattingCharacter = @"\p{Cf}"; 
    const string connectingCharacter = @"\p{Pc}"; 
    const string decimalDigitCharacter = @"\p{Nd}"; 
    const string combiningCharacter = @"\p{Mn}|\p{Mc}"; 
    const string letterCharacter = @"\p{Lu}|\p{Ll}|\p{Lt}|\p{Lm}|\p{Lo}|\p{Nl}"; 
    const string identifierPartCharacter = letterCharacter + "|" + 
              decimalDigitCharacter + "|" + 
              connectingCharacter + "|" + 
              combiningCharacter + "|" + 
              formattingCharacter; 
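    // zero or more part characters; a single quantifier avoids the nested "(...+)*" form, 
    // which can backtrack catastrophically on long invalid input 
    const string identifierPartCharacters = "(" + identifierPartCharacter + ")*"; 
    const string identifierStartCharacter = "(" + letterCharacter + "|_)"; 
    const string identifierOrKeyword = identifierStartCharacter + identifierPartCharacters; 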
    var validIdentifierRegex = new Regex("^" + identifierOrKeyword + "$", RegexOptions.Compiled); 
    var normalizedIdentifier = identifier.Normalize(); 

    // 1. check that the identifier matches the validIdentifierRegex and is not a C# keyword 
    if (validIdentifierRegex.IsMatch(normalizedIdentifier) && !keywords.Contains(normalizedIdentifier)) 
    { 
     return true; 
    } 

    // 2. check if the identifier starts with @ 
    if (normalizedIdentifier.StartsWith("@") && validIdentifierRegex.IsMatch(normalizedIdentifier.Substring(1))) 
    { 
     return true; 
    } 

    // 3. it's not a valid identifier 
    return false; 
} 
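One thing to note about the method above is that it builds a compiled Regex and a keyword array on every call. A possible variation, sketched here on the assumption that the method is called often, keeps both in static fields and uses character classes in place of alternation. The class name IdentifierValidator is made up for illustration, and the keyword list is the C# 1.x list from the MSDN page linked above, so newer keywords are not covered.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class IdentifierValidator
{
    // Reserved keywords, looked up in O(1) with an ordinal comparison.
    private static readonly HashSet<string> Keywords = new HashSet<string>(StringComparer.Ordinal)
    {
        "abstract", "as", "base", "bool", "break", "byte", "case", "catch", "char",
        "checked", "class", "const", "continue", "decimal", "default", "delegate",
        "do", "double", "else", "enum", "event", "explicit", "extern", "false",
        "finally", "fixed", "float", "for", "foreach", "goto", "if", "implicit",
        "in", "int", "interface", "internal", "is", "lock", "long", "namespace",
        "new", "null", "object", "operator", "out", "override", "params", "private",
        "protected", "public", "readonly", "ref", "return", "sbyte", "sealed",
        "short", "sizeof", "stackalloc", "static", "string", "struct", "switch",
        "this", "throw", "true", "try", "typeof", "uint", "ulong", "unchecked",
        "unsafe", "ushort", "using", "virtual", "void", "volatile", "while"
    };

    private const string Start = @"[_\p{Lu}\p{Ll}\p{Lt}\p{Lm}\p{Lo}\p{Nl}]";
    private const string Part =
        @"[\p{Lu}\p{Ll}\p{Lt}\p{Lm}\p{Lo}\p{Nl}\p{Mn}\p{Mc}\p{Nd}\p{Pc}\p{Cf}]";

    // Built once; \z (rather than $) also rejects a trailing newline.
    private static readonly Regex Pattern = new Regex(
        "^" + Start + Part + @"*\z",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    public static bool IsValidIdentifier(this string identifier)
    {
        if (string.IsNullOrEmpty(identifier)) return false;

        string normalized = identifier.Normalize();

        // A leading @ allows keywords to be used as identifiers.
        if (normalized[0] == '@')
            return Pattern.IsMatch(normalized.Substring(1));

        return Pattern.IsMatch(normalized) && !Keywords.Contains(normalized);
    }
}
```

With this shape, "class".IsValidIdentifier() is rejected as a keyword, while "@class".IsValidIdentifier() is accepted.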
3

Necromancing here.

In .NET Core / DNX, you can do it with Roslyn's SyntaxFacts:

Microsoft.CodeAnalysis.CSharp.SyntaxFacts.IsReservedKeyword(
     Microsoft.CodeAnalysis.CSharp.SyntaxFacts.GetKeywordKind("protected") 
); 



foreach (ColumnDefinition cl in tableColumns) 
{ 
    sb.Append(@"   public "); 
    sb.Append(cl.DOTNET_TYPE); 
    sb.Append(" "); 

    // for keywords 
    //if (!Microsoft.CodeAnalysis.CSharp.SyntaxFacts.IsValidIdentifier(cl.COLUMN_NAME)) 
    if (Microsoft.CodeAnalysis.CSharp.SyntaxFacts.IsReservedKeyword(
     Microsoft.CodeAnalysis.CSharp.SyntaxFacts.GetKeywordKind(cl.COLUMN_NAME) 
     )) 
     sb.Append("@"); 

    sb.Append(cl.COLUMN_NAME); 
    sb.Append("; // "); 
    sb.AppendLine(cl.SQL_TYPE); 
} // Next cl 


Or, in the old variant with CodeDOM, after a look at the Mono source code:

CodeDomProvider.cs

public virtual bool IsValidIdentifier (string value) 
{ 
    ICodeGenerator cg = CreateGenerator(); 
    if (cg == null) 
        throw GetNotImplemented(); 
    return cg.IsValidIdentifier (value); 
} 

Then CSharpCodeProvider.cs

public override ICodeGenerator CreateGenerator() 
{ 
#if NET_2_0 
    if (providerOptions != null && providerOptions.Count > 0) 
        return new Mono.CSharp.CSharpCodeGenerator (providerOptions); 
#endif 
    return new Mono.CSharp.CSharpCodeGenerator(); 
} 

Then CSharpCodeGenerator.cs

protected override bool IsValidIdentifier (string identifier) 
{ 
    if (identifier == null || identifier.Length == 0) 
     return false; 

    if (keywordsTable == null) 
     FillKeywordTable(); 

    if (keywordsTable.Contains (identifier)) 
     return false; 

    if (!is_identifier_start_character (identifier [0])) 
     return false; 

    for (int i = 1; i < identifier.Length; i ++) 
     if (! is_identifier_part_character (identifier [i])) 
      return false; 

    return true; 
} 



private static System.Collections.Hashtable keywordsTable; 
private static string[] keywords = new string[] { 
    "abstract","event","new","struct","as","explicit","null","switch","base","extern", 
    "this","false","operator","throw","break","finally","out","true", 
    "fixed","override","try","case","params","typeof","catch","for", 
    "private","foreach","protected","checked","goto","public", 
    "unchecked","class","if","readonly","unsafe","const","implicit","ref", 
    "continue","in","return","using","virtual","default", 
    "interface","sealed","volatile","delegate","internal","do","is", 
    "sizeof","while","lock","stackalloc","else","static","enum", 
    "namespace", 
    "object","bool","byte","float","uint","char","ulong","ushort", 
    "decimal","int","sbyte","short","double","long","string","void", 
    "partial", "yield", "where" 
}; 


static void FillKeywordTable() 
{ 
    lock (keywords) { 
     if (keywordsTable == null) { 
      keywordsTable = new Hashtable(); 
      foreach (string keyword in keywords) { 
       keywordsTable.Add (keyword, keyword); 
      } 
     } 
    } 
} 



static bool is_identifier_start_character (char c) 
{ 
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || c == '@' || Char.IsLetter (c); 
} 

static bool is_identifier_part_character (char c) 
{ 
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || (c >= '0' && c <= '9') || Char.IsLetter (c); 
} 

You get this code:

public static bool IsValidIdentifier (string identifier) 
{ 
    if (identifier == null || identifier.Length == 0) 
     return false; 

    if (keywordsTable == null) 
     FillKeywordTable(); 

    if (keywordsTable.Contains(identifier)) 
     return false; 

    if (!is_identifier_start_character(identifier[0])) 
     return false; 

    for (int i = 1; i < identifier.Length; i++) 
     if (!is_identifier_part_character(identifier[i])) 
      return false; 

    return true; 
} 


internal static bool is_identifier_start_character(char c) 
{ 
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || c == '@' || char.IsLetter(c); 
} 

internal static bool is_identifier_part_character(char c) 
{ 
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || (c >= '0' && c <= '9') || char.IsLetter(c); 
} 


private static System.Collections.Hashtable keywordsTable; 
private static string[] keywords = new string[] { 
    "abstract","event","new","struct","as","explicit","null","switch","base","extern", 
    "this","false","operator","throw","break","finally","out","true", 
    "fixed","override","try","case","params","typeof","catch","for", 
    "private","foreach","protected","checked","goto","public", 
    "unchecked","class","if","readonly","unsafe","const","implicit","ref", 
    "continue","in","return","using","virtual","default", 
    "interface","sealed","volatile","delegate","internal","do","is", 
    "sizeof","while","lock","stackalloc","else","static","enum", 
    "namespace", 
    "object","bool","byte","float","uint","char","ulong","ushort", 
    "decimal","int","sbyte","short","double","long","string","void", 
    "partial", "yield", "where" 
}; 

internal static void FillKeywordTable() 
{ 
    lock (keywords) 
    { 
     if (keywordsTable == null) 
     { 
      keywordsTable = new System.Collections.Hashtable(); 
      foreach (string keyword in keywords) 
      { 
       keywordsTable.Add(keyword, keyword); 
      } 
     } 
    } 
} 
3

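With Roslyn being open source, code analysis tools are right at your fingertips, and they are written for performance. (Right now they are in pre-release.)

However, I can't speak to the performance cost of loading the assembly.

Install the tools using NuGet: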

Install-Package Microsoft.CodeAnalysis -Pre 

Ask your question:

var isValid = Microsoft.CodeAnalysis.CSharp.SyntaxFacts.IsValidIdentifier("I'mNotValid"); 
Console.WriteLine(isValid);  // False 
2

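The now-released Roslyn project provides Microsoft.CodeAnalysis.CSharp.SyntaxFacts, with SyntaxFacts.IsIdentifierStartCharacter(char) and SyntaxFacts.IsIdentifierPartCharacter(char) methods, just like Java.

Here it is in use, in a simple function I use to turn noun phrases (e.g. "Start Date") into C# identifiers (e.g. "StartDate"). N.B. I'm using Humanizer to do the camel-case conversion, and Roslyn to check whether a character is valid.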

public static string Identifier(string name) 
    { 
     Check.IsNotNullOrWhitespace(name, nameof(name)); 

     // trim off leading and trailing whitespace 
     name = name.Trim(); 

     // should deal with spaces => camel casing; 
     name = name.Dehumanize(); 

     var sb = new StringBuilder(); 
     if (!SyntaxFacts.IsIdentifierStartCharacter(name[0])) 
     { 
      // the first characters 
      sb.Append("_"); 
     } 

     foreach(var ch in name) 
     { 
      if (SyntaxFacts.IsIdentifierPartCharacter(ch)) 
      { 
       sb.Append(ch); 
      } 
     } 

     var result = sb.ToString(); 

     if (SyntaxFacts.GetKeywordKind(result) != SyntaxKind.None) 
     { 
      result = @"@" + result; 
     } 

     return result; 
    } 

Tests:

[TestCase("Start Date", "StartDate")] 
    [TestCase("Bad*chars", "BadChars")] 
    [TestCase(" leading ws", "LeadingWs")] 
    [TestCase("trailing ws ", "TrailingWs")] 
    [TestCase("class", "Class")] 
    [TestCase("int", "Int")] 
    [Test] 
    public void CSharp_GeneratesDecentIdentifiers(string input, string expected) 
    { 
     Assert.AreEqual(expected, CSharp.Identifier(input)); 
    } 
+0

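A useful fact, but not helpful, since you didn't explain how to make use of it. I can't seem to find a "Microsoft.CodeAnalysis" NuGet package, nor can I seem to find an official page that explains where the library can be obtained. – NightOwl888 2016-11-09 16:35:17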

+0

I provided the link in the first sentence: https://github.com/dotnet/roslyn. It notes: 'nuget install Microsoft.CodeAnalysis # Install Language APIs and Services' – 2016-11-12 20:37:52

+0

You should install 'Microsoft.CodeAnalysis.CSharp' to get the C# rules. – 2016-11-12 20:50:44
