2011-09-23 64 views
18
String input = "THESE TERMS AND CONDITIONS OF SERVICE (the Terms) ARE A LEGAL AND BINDING AGREEMENT BETWEEN YOU AND NATIONAL GEOGRAPHIC governing your use of this site, www.nationalgeographic.com, which includes but is not limited to products, software and services offered by way of the website such as the Video Player, Uploader, and other applications that link to these Terms (the Site). Please review the Terms fully before you continue to use the Site. By using the Site, you agree to be bound by the Terms. You shall also be subject to any additional terms posted with respect to individual sections of the Site. Please review our Privacy Policy, which also governs your use of the Site, to understand our practices. If you do not agree, please discontinue using the Site. National Geographic reserves the right to change the Terms at any time without prior notice. Your continued access or use of the Site after such changes indicates your acceptance of the Terms as modified. It is your responsibility to review the Terms regularly. The Terms were last updated on 18 July 2011."; 

//text copied from http://www.nationalgeographic.com/community/terms/ 

In Java, I want to split this large string into lines of a maximum length: no line should contain more than MAX_LINE_LENGTH characters.

What I have tried so far:

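int MAX_LINE_LENGTH = 20; // maximum length of a line: 20 characters
// Build the regex by concatenation; a variable name inside a string literal is not substituted
System.out.print(Arrays.toString(input.split("(?<=\\G.{" + MAX_LINE_LENGTH + "})")));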

Output:

[THESE TERMS AND COND, ITIONS OF SERVICE (t, he Terms) ARE A LEGA, L AND B ... 

This breaks words apart, which I don't want. Instead, the output I want looks like this:

[THESE TERMS AND , CONDITIONS OF , SERVICE (the Terms) , ARE A LEGAL AND B ... 

One additional condition: if a word is longer than MAX_LINE_LENGTH, that word should be split.

The solution should not rely on any external jars.

+3

Possible duplicate of http://stackoverflow.com/questions/4055430/java-code-for-wrapping-text-lines-to-a-max-line-width – hammar

+0

@hammer - My client doesn't want me to use any external jar files. I couldn't find any solution in the thread you mentioned that works without external jars. – Abhishek

+0

Yeah... just did that. – Abhishek

Answers

22

Just iterate through the string word by word and break the line whenever a word would exceed the limit.

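public String addLinebreaks(String input, int maxLineLength) {
    StringTokenizer tok = new StringTokenizer(input, " ");
    StringBuilder output = new StringBuilder(input.length());
    int lineLen = 0;
    while (tok.hasMoreTokens()) {
        String word = tok.nextToken();

        // Start a new line if the word plus a separating space does not fit
        if (lineLen > 0 && lineLen + 1 + word.length() > maxLineLength) {
            output.append("\n");
            lineLen = 0;
        }
        // The tokenizer drops the spaces, so put one back between words on a line
        if (lineLen > 0) {
            output.append(' ');
            lineLen++;
        }
        output.append(word);
        lineLen += word.length();
    }
    return output.toString();
}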

I just typed that in freehand, so you may have to push and prod a bit to make it compile.

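Bug: a word in the input that is longer than maxLineLength is not split, so it produces a line that exceeds the limit. I'm assuming your line length is something like 80 or 120 characters, in which case this is unlikely to be a problem.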
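One way the overlong-word case could be handled with the standard library only is to cut such a word into full-width pieces once the current line has been flushed. The following is a minimal sketch, not the answer author's code: the class name `WrapSketch` and the method `wrap` are made up for illustration, and it returns a list of lines rather than one string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class WrapSketch {

    /** Greedy word wrap; words longer than maxLineLength are cut into pieces. */
    static List<String> wrap(String input, int maxLineLength) {
        if (maxLineLength < 1) {
            throw new IllegalArgumentException("maxLineLength must be at least 1");
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        StringTokenizer tok = new StringTokenizer(input, " ");
        while (tok.hasMoreTokens()) {
            String word = tok.nextToken();

            // Flush the current line if the word plus a separating space does not fit
            if (line.length() > 0 && line.length() + 1 + word.length() > maxLineLength) {
                lines.add(line.toString());
                line.setLength(0);
            }
            // An overlong word always reaches this point with an empty line:
            // emit full-width pieces until the remainder fits
            while (word.length() > maxLineLength) {
                lines.add(word.substring(0, maxLineLength));
                word = word.substring(maxLineLength);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

With a limit of 20, `WrapSketch.wrap("use of this site, www.nationalgeographic.com, which includes", 20)` yields the lines `use of this site,`, `www.nationalgeograph`, `ic.com, which` and `includes`. The remainder of a split word shares its line with the words that follow; whether that is desirable depends on the use case.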

+0

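No. We need to fix this bug as well, because my max_line_length is 30. My lines may contain file names, which can be longer than 30 characters. In that case we need to break the word. – Abhishek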

+0

I just confirmed that the file names will not exceed 15 characters. So cheers, friend! \m/ – Abhishek

+8

I just changed one part of your code: 'String word = tok.nextToken() + " ";' – Abhishek

1

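I recently wrote a few methods to do this. If a line contains no whitespace characters, they opt to split on other non-alphanumeric characters before resorting to a mid-word split.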

Here is how it turned out for me:

(using the lastIndexOfRegex() methods I posted here)

/** 
* Indicates that a String search operation yielded no results. 
*/ 
public static final int NOT_FOUND = -1; 



/** 
* Version of lastIndexOf that uses regular expressions for searching. 
* By Tomer Godinger. 
* 
* @param str String in which to search for the pattern. 
* @param toFind Pattern to locate. 
* @return The index of the requested pattern, if found; NOT_FOUND (-1) otherwise. 
*/ 
public static int lastIndexOfRegex(String str, String toFind) 
{ 
    Pattern pattern = Pattern.compile(toFind); 
    Matcher matcher = pattern.matcher(str); 

    // Default to the NOT_FOUND constant 
    int lastIndex = NOT_FOUND; 

    // Search for the given pattern 
    while (matcher.find()) 
    { 
     lastIndex = matcher.start(); 
    } 

    return lastIndex; 
} 

/** 
* Finds the last index of the given regular expression pattern in the given string, 
* starting from the given index (and conceptually going backwards). 
* By Tomer Godinger. 
* 
* @param str String in which to search for the pattern. 
* @param toFind Pattern to locate. 
* @param fromIndex Maximum allowed index. 
* @return The index of the requested pattern, if found; NOT_FOUND (-1) otherwise. 
*/ 
public static int lastIndexOfRegex(String str, String toFind, int fromIndex) 
{ 
    // Limit the search by searching on a suitable substring 
    return lastIndexOfRegex(str.substring(0, fromIndex), toFind); 
} 

/** 
* Breaks the given string into lines as best possible, each of which no longer than 
* <code>maxLength</code> characters. 
* By Tomer Godinger. 
* 
* @param str The string to break into lines. 
* @param maxLength Maximum length of each line. 
* @param newLineString The string to use for line breaking. 
* @return The resulting multi-line string. 
*/ 
public static String breakStringToLines(String str, int maxLength, String newLineString) 
{ 
    StringBuilder result = new StringBuilder(); 
    while (str.length() > maxLength) 
    { 
     // Attempt to break on whitespace first, 
     int breakingIndex = lastIndexOfRegex(str, "\\s", maxLength); 

     // Then on other non-alphanumeric characters, 
     if (breakingIndex == NOT_FOUND) breakingIndex = lastIndexOfRegex(str, "[^a-zA-Z0-9]", maxLength); 

     // And if all else fails, break in the middle of the word 
     // (maxLength - 1 keeps the emitted line at exactly maxLength characters)
     if (breakingIndex == NOT_FOUND) breakingIndex = maxLength - 1; 

     // Append each prepared line to the builder 
     result.append(str.substring(0, breakingIndex + 1)); 
     result.append(newLineString); 

     // And start the next line 
     str = str.substring(breakingIndex + 1); 
    } 

    // Check if there are any residual characters left 
    if (str.length() > 0) 
    { 
     result.append(str); 
    } 

    // Return the resulting string 
    return result.toString(); 
} 
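The three-step priority described in this answer (whitespace first, then any other non-alphanumeric character, then a hard break) can also be found in a single backward scan without regular expressions. This is only a sketch under stated assumptions: `BreakIndexSketch` and `findBreakIndex` are hypothetical names, and it uses `Character.isWhitespace` and `Character.isLetterOrDigit`, which differ from the `\s` and `[^a-zA-Z0-9]` patterns above for some characters, such as accented letters.

```java
final class BreakIndexSketch {

    /**
     * Returns the index of the last character that should stay on the current line,
     * looking only at the first maxLength characters of str.
     * Assumes str.length() > maxLength >= 1.
     */
    static int findBreakIndex(String str, int maxLength) {
        int lastOther = -1;
        for (int i = maxLength - 1; i >= 0; i--) {
            char c = str.charAt(i);
            if (Character.isWhitespace(c)) {
                return i; // first choice: the last whitespace in the window
            }
            if (lastOther < 0 && !Character.isLetterOrDigit(c)) {
                lastOther = i; // second choice: the last other non-alphanumeric character
            }
        }
        // Third choice: hard break so the line is exactly maxLength characters long
        return lastOther >= 0 ? lastOther : maxLength - 1;
    }
}
```

A caller would emit `str.substring(0, index + 1)` as the line and continue with `str.substring(index + 1)`, in the same way as the loop in breakStringToLines.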
6

Thanks, Barend Garvelink, for your answer. I have modified the code above to fix the bug "if a word in the input is longer than maxCharInLine":

public String[] splitIntoLine(String input, int maxCharInLine){ 

    StringTokenizer tok = new StringTokenizer(input, " "); 
    StringBuilder output = new StringBuilder(input.length()); 
    int lineLen = 0; 
    while (tok.hasMoreTokens()) { 
     String word = tok.nextToken(); 

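     while (word.length() > maxCharInLine) { 
      // Room left on the current line; clamped because lineLen can exceed the limit by one
      int room = Math.max(0, maxCharInLine - lineLen); 
      output.append(word.substring(0, room) + "\n"); 
      word = word.substring(room); 
      lineLen = 0; 
     } 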

     if (lineLen + word.length() > maxCharInLine) { 
      output.append("\n"); 
      lineLen = 0; 
     } 
     output.append(word + " "); 

     lineLen += word.length() + 1; 
    } 
    // output.split(); 
    // return output.toString(); 
    return output.toString().split("\n"); 
} 
+0

You should use output.append(word).append(" "); – clic

4

Starting from @Barend's suggestion, here is my final version with minor modifications:

private static final char NEWLINE = '\n'; 
private static final String SPACE_SEPARATOR = " "; 
//if text has \n, \r or \t symbols it's better to split by \s+ 
private static final String SPLIT_REGEXP= "\\s+"; 

public static String breakLines(String input, int maxLineLength) { 
    String[] tokens = input.split(SPLIT_REGEXP); 
    StringBuilder output = new StringBuilder(input.length()); 
    int lineLen = 0; 
    for (int i = 0; i < tokens.length; i++) { 
     String word = tokens[i]; 

     if (lineLen + (SPACE_SEPARATOR + word).length() > maxLineLength) { 
      if (i > 0) { 
       output.append(NEWLINE); 
      } 
      lineLen = 0; 
     } 
     if (i < tokens.length - 1 && (lineLen + (word + SPACE_SEPARATOR).length() + tokens[i + 1].length() <= 
       maxLineLength)) { 
      word += SPACE_SEPARATOR; 
     } 
     output.append(word); 
     lineLen += word.length(); 
    } 
    return output.toString(); 
} 

System.out.println(breakLines("THESE TERMS AND CONDITIONS OF SERVICE (the Terms) ARE A  LEGAL AND BINDING " + 
       "AGREEMENT BETWEEN YOU AND NATIONAL GEOGRAPHIC governing  your use of this site, " + 
      "www.nationalgeographic.com, which includes but is not limited to products, " + 
      "software and services offered by way of the website such as the Video Player.", 20)); 

Output:

THESE TERMS AND 
CONDITIONS OF 
SERVICE (the Terms) 
ARE A LEGAL AND 
BINDING AGREEMENT 
BETWEEN YOU AND 
NATIONAL GEOGRAPHIC 
governing your use 
of this site, 
www.nationalgeographic.com, 
which includes but 
is not limited to 
products, software 
and services 
offered by way of 
the website such as 
the Video Player. 
9

Best: use Apache Commons Lang:

org.apache.commons.lang.WordUtils

/** 
* <p>Wraps a single line of text, identifying words by <code>' '</code>.</p> 
* 
* <p>New lines will be separated by the system property line separator. 
* Very long words, such as URLs will <i>not</i> be wrapped.</p> 
* 
* <p>Leading spaces on a new line are stripped. 
* Trailing spaces are not stripped.</p> 
* 
* <pre> 
* WordUtils.wrap(null, *) = null 
* WordUtils.wrap("", *) = "" 
* </pre> 
* 
* @param str the String to be word wrapped, may be null 
* @param wrapLength the column to wrap the words at, less than 1 is treated as 1 
* @return a line with newlines inserted, <code>null</code> if null input 
*/ 
public static String wrap(String str, int wrapLength) { 
    return wrap(str, wrapLength, null, false); 
} 
5

You can use the WordUtils.wrap method from Apache Commons Lang:

import java.util.*; 
import org.apache.commons.lang3.text.WordUtils; 
public class test3 { 


public static void main(String[] args) { 

    String S = "THESE TERMS AND CONDITIONS OF SERVICE (the Terms) ARE A LEGAL AND BINDING AGREEMENT BETWEEN YOU AND NATIONAL GEOGRAPHIC governing your use of this site, www.nationalgeographic.com, which includes but is not limited to products, software and services offered by way of the website such as the Video Player, Uploader, and other applications that link to these Terms (the Site). Please review the Terms fully before you continue to use the Site. By using the Site, you agree to be bound by the Terms. You shall also be subject to any additional terms posted with respect to individual sections of the Site. Please review our Privacy Policy, which also governs your use of the Site, to understand our practices. If you do not agree, please discontinue using the Site. National Geographic reserves the right to change the Terms at any time without prior notice. Your continued access or use of the Site after such changes indicates your acceptance of the Terms as modified. It is your responsibility to review the Terms regularly. The Terms were last updated on 18 July 2011."; 
    String F = WordUtils.wrap(S, 20); 
    String[] F1 = F.split(System.lineSeparator()); 
    System.out.println(Arrays.toString(F1)); 

}} 

Output:

[THESE TERMS AND, CONDITIONS OF, SERVICE (the Terms), ARE A LEGAL AND, BINDING AGREEMENT, BETWEEN YOU AND, NATIONAL GEOGRAPHIC, governing your use, of this site,, www.nationalgeographic.com,, which includes but, is not limited to, products, software, and services offered, by way of the, website such as the, Video Player,, Uploader, and other, applications that, link to these Terms, (the Site). Please, review the Terms, fully before you, continue to use the, Site. By using the, Site, you agree to, be bound by the, Terms. You shall, also be subject to, any additional terms, posted with respect, to individual, sections of the, Site. Please review, our Privacy Policy,, which also governs, your use of the, Site, to understand, our practices. If, you do not agree,, please discontinue, using the Site., National Geographic, reserves the right, to change the Terms, at any time without, prior notice. Your, continued access or, use of the Site, after such changes, indicates your, acceptance of the, Terms as modified., It is your, responsibility to, review the Terms, regularly. The Terms, were last updated on, 18 July 2011.] 
1

My version (the previous ones did not work):

public static List<String> breakSentenceSmart(String text, int maxWidth) { 

    StringTokenizer stringTokenizer = new StringTokenizer(text, " "); 
    List<String> lines = new ArrayList<String>(); 
    StringBuilder currLine = new StringBuilder(); 
    while (stringTokenizer.hasMoreTokens()) { 
     String word = stringTokenizer.nextToken(); 

     boolean wordPut=false; 
     while (!wordPut) { 
      if(currLine.length()+word.length()==maxWidth) { //exactly fits -> dont add the space 
       currLine.append(word); 
       wordPut=true; 
      } 
      else if(currLine.length()+word.length()<=maxWidth) { //whole word can be put 
       if(stringTokenizer.hasMoreTokens()) { 
        currLine.append(word + " "); 
       }else{ 
        currLine.append(word); 
       } 
       wordPut=true; 
      }else{ 
       if(word.length()>maxWidth) { 
        int lineLengthLeft = maxWidth - currLine.length(); 
        String firstWordPart = word.substring(0, lineLengthLeft); 
        currLine.append(firstWordPart); 
        //lines.add(currLine.toString()); 
        word = word.substring(lineLengthLeft); 
        //currLine = new StringBuilder(); 
       } 
       lines.add(currLine.toString()); 
       currLine = new StringBuilder(); 
      } 

     } 
     // 
    } 
    if(currLine.length()>0) { //add whats left 
     lines.add(currLine.toString()); 
    } 
    return lines; 
} 
1

Since Java 8 you can also use streams to tackle such problems.

Below you can find a complete example that makes use of reduction using the .collect() method.

I think this one should be shorter than the other solutions that don't use third-party libraries.

private static String multiLine(String longString, String splitter, int maxLength) { 
    return Arrays.stream(longString.split(splitter)) 
      .collect(
       ArrayList<String>::new,  
       (l, s) -> { 
        Function<ArrayList<String>, Integer> id = list -> list.size() - 1; 
        if(l.size() == 0 || (l.get(id.apply(l)).length() != 0 && l.get(id.apply(l)).length() + s.length() >= maxLength)) l.add(""); 
        l.set(id.apply(l), l.get(id.apply(l)) + (l.get(id.apply(l)).length() == 0 ? "" : splitter) + s); 
       }, 
       (l1, l2) -> l1.addAll(l2)) 
      .stream().reduce((s1, s2) -> s1 + "\n" + s2).get(); 
} 

public static void main(String[] args) { 
    String longString = "THESE TERMS AND CONDITIONS OF SERVICE (the Terms) ARE A LEGAL AND BINDING AGREEMENT BETWEEN YOU AND NATIONAL GEOGRAPHIC governing your use of this site, www.nationalgeographic.com, which includes but is not limited to products, software and services offered by way of the website such as the Video Player, Uploader, and other applications that link to these Terms (the Site). Please review the Terms fully before you continue to use the Site. By using the Site, you agree to be bound by the Terms. You shall also be subject to any additional terms posted with respect to individual sections of the Site. Please review our Privacy Policy, which also governs your use of the Site, to understand our practices. If you do not agree, please discontinue using the Site. National Geographic reserves the right to change the Terms at any time without prior notice. Your continued access or use of the Site after such changes indicates your acceptance of the Terms as modified. It is your responsibility to review the Terms regularly. The Terms were last updated on 18 July 2011."; 
    String SPLITTER = " "; 
    int MAX_LENGTH = 20; 
    System.out.println(multiLine(longString, SPLITTER, MAX_LENGTH)); 
} 