2011-09-26

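It is sometimes desirable to start each sentence in a paragraph on a separate line. For instance, this makes it easier to diff large text documents, since a change in one sentence does not affect the entire paragraph. Some markup systems (e.g. *roff) also require each sentence to begin on a new line. Is there a way to make `fill-paragraph` stop at the end of a sentence?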

Is there a way, e.g. by judiciously redefining paragraph-separate and paragraph-start, to make fill-paragraph stop between sentences?

(Note: I am using Emacs 23.3.1)


Update: sample mdoc (*roff) markup:

The 
.Nm 
utility makes a series of passes with increasing block sizes. 
In each pass, it either reads or writes (or both) a number of 
non-consecutive blocks at increasing offsets relative to the ideal 
alignment, which is assumed to be multiples of the block size. 
The results are presented in terms of time elapsed, transactions per 
second and kB per second. 

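This is a three-sentence paragraph in which each sentence starts on a separate line, even where there was room for its first word on the previous line. Currently, fill-paragraph converts it into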

The 
.Nm 
utility makes a series of passes with increasing block sizes. In each 
pass, it either reads or writes (or both) a number of non-consecutive 
blocks at increasing offsets relative to the ideal alignment, which is 
assumed to be multiples of the block size. The results are presented 
in terms of time elapsed, transactions per second and kB per second. 

which is what I want to avoid.


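Update: regarding sentences and paragraphs

I see that my question was a little unclear, because I used the term "paragraph" to refer both to what Emacs calls a paragraph and to what ends up as a contiguous block of text in the output of whatever processor I use (groff, latex, etc.). To clarify,

  • I need to keep sentences together without any blank lines between them; groff does not like blank lines, and latex treats them as paragraph separators.
  • I need fill-paragraph to operate on individual sentences, i.e. I want to redefine a paragraph as something that starts after a blank line or after the end of the previous paragraph, and that ends with a period followed by a newline or by at least two whitespace characters.
  • I would love for fill-paragraph to split a block of text into separate sentences, but I do not think that can be done easily.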

For example, if I type the following:

The 
.Nm 
utility makes a series of passes with increasing block sizes. 
In each pass, it either reads or writes (or both) a number of non-consecutive blocks at increasing offsets relative to the ideal alignment, which is assumed to be multiples of the block size. 
The results are presented in terms of time elapsed, transactions per second and kB per second. 

then move the cursor to the line that starts with "In each pass" and press M-q, I should get the following:

The 
.Nm 
utility makes a series of passes with increasing block sizes. 
In each pass, it either reads or writes (or both) a number of 
non-consecutive blocks at increasing offsets relative to the ideal 
alignment, which is assumed to be multiples of the block size. 
The results are presented in terms of time elapsed, transactions per second and kB per second. 

Note that the last sentence is unchanged.

Answers

1

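How about telling paragraph-start to look for lines that begin with a capital letter:

"\f\\|[ \t]*$\\|^[A-Z]"

Note that the new part is \\|^[A-Z]

This should work in most cases. You only need to watch out for the few cases where a sentence contains a capitalised word and happens to be just long enough for the line to break right before that word.

Edit: you may want to allow for indentation too:

"\f\\|[ \t]*$\\|^[ \t]*[A-Z]"

The square brackets contain a space and a tab (written \t here).

Edit: you need to turn off case-fold-search for this to work, otherwise the match does not distinguish upper-case from lower-case letters!

Edit: if you want to turn off case-fold-search for this function only, bind the following to M-q (locally or globally, as you see fit).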

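(defun my-fill-paragraph ()
  "Fill the current paragraph with case-sensitive paragraph matching."
  (interactive)
  ;; Bind case-fold-search to nil so that [A-Z] in paragraph-start
  ;; only matches upper-case letters.
  (let ((case-fold-search nil))
    (fill-paragraph)))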
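
For completeness, here is a minimal sketch of one way to install the regexp, assuming the text is edited in text-mode buffers. The function name my-sentence-paragraphs-setup is made up for illustration; paragraph-start, make-local-variable and text-mode-hook are standard Emacs. Case folding is deliberately left alone here, so the sketch is meant to be combined with a command that binds case-fold-search to nil around fill-paragraph.

```elisp
;; Sketch only: `my-sentence-paragraphs-setup' is a hypothetical name.
(defun my-sentence-paragraphs-setup ()
  "Make lines that begin with a capital letter start a new paragraph."
  ;; The default value of `paragraph-start' is "\f\\|[ \t]*$"; the last
  ;; alternative is the addition.  It only distinguishes case while
  ;; `case-fold-search' is nil.
  (set (make-local-variable 'paragraph-start)
       "\f\\|[ \t]*$\\|^[ \t]*[A-Z]"))

;; Assumption: the markup is edited in text-mode or a mode derived from it.
(add-hook 'text-mode-hook 'my-sentence-paragraphs-setup)
```

Keep in mind that paragraph motion commands such as M-{ and M-} consult the same variable, so they will stop at these lines as well.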

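Hmm, if Emacs' regexp syntax supported zero-width, variable-length look-behind assertions, it should be possible to construct a regexp equivalent to the Perl regexp /(?:\n|\s{2,})\K\S/s, which matches the first non-whitespace character that follows a newline or at least two whitespace characters. – DES

You have lost me. When does the regexp I posted fail? I am assuming that you press return manually after typing each sentence, so that Emacs does not have to split the paragraph for you and only has to respect sentence boundaries when filling. – Tyler

I do not think Emacs supports assertions. – Tyler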

1

Does this DTRT?

(defun separate-sentences (&optional beg end)
  "Ensure each sentence ends with a newline.
When no region is specified, use the current paragraph."
  (interactive (when (use-region-p)
                 (list (region-beginning) (region-end))))
  (unless (and beg end)
    (save-excursion
      (forward-paragraph -1)
      (setq beg (point))
      (forward-paragraph 1)
      (setq end (point))))
  ;; Use a marker for END so it stays valid while newlines are inserted.
  (setq end (if (markerp end)
                end
              (set-marker (make-marker) end)))
  (save-excursion
    (goto-char beg)
    (while (re-search-forward (sentence-end) end t)
      ;; Leave sentence ends that already sit at a line boundary alone.
      (unless (or (looking-at-p "[ \t]*$")
                  (looking-back "^[ \t]*" (line-beginning-position)))
        (insert "\n")))))

(defun fill-paragraph-sentence-groups (justify)
  "Fill groups of sentences together.
A sentence ending with a newline marks the end of a group."
  (save-excursion
    (save-restriction
      (narrow-to-region (progn (forward-paragraph -1) (point))
                        (progn (forward-paragraph 1) (point)))
      (goto-char (point-min))
      (skip-chars-forward " \t\n")
      (while (not (or (looking-at-p paragraph-separate)
                      (eobp)))
        (fill-region (point)
                     (progn
                       ;; Advance one sentence at a time until a sentence
                       ;; ends at end of line.  A plain `while' is used
                       ;; instead of `loop', which needs the cl library.
                       (while (progn (forward-sentence 1)
                                     (not (looking-at-p "[ \t]*$"))))
                       (point))
                     justify)
        (unless (looking-back "^[ \t]*" (line-beginning-position))
          (forward-line 1)))
      t)))

(defun fill-paragraph-sentence-individual (justify)
  "Put each sentence in the paragraph on a new line, then fill it."
  (save-excursion
    (separate-sentences)
    (fill-paragraph-sentence-groups justify)))

;; deployment option 1: add to major-mode hook

(add-hook 'text-mode-hook
          (lambda ()
            ;; The variable name must be quoted for make-local-variable.
            (set (make-local-variable 'fill-paragraph-function)
                 'fill-paragraph-sentence-individual)))

;; deployment option 2: call my-fill-paragraph anywhere

(defun my-fill-paragraph (arg)
  (interactive "*P")
  (let ((fill-paragraph-function 'fill-paragraph-sentence-individual))
    (fill-paragraph arg)))

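The above gives two paragraph-filling functions. One fills groups of sentences together, where a group runs until a sentence ends at a newline. The other puts each sentence on a new line.

I only show how to deploy the individual version, since that is what the OP wants. If you prefer, follow the same model to deploy the group version.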

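The question is not how to split a paragraph into sentences, but how to prevent `fill-paragraph` from rejoining them. – DES

The question is not very clear. Can you add some examples of what you want? Do you also want to keep the current paragraph movement behaviour? –

Added sample markup. – DES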

0

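You can use fill-region, which, unsurprisingly, fills only the current region. Based on that, you can define a fill-sentence function. I think a simple way to detect sentences is this:

  • If a line ends with ., ? or !, it is a sentence-ending line.

  • If the line before it is blank or is a sentence-ending line, then the line starts a sentence.

That said, getting this to work properly in all cases is quite tricky.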
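
A rough sketch of those two rules, to make the idea concrete. All three function names are hypothetical, and the heuristic is exactly as naive as described: an abbreviation such as "e.g." at the end of a line will be mistaken for a sentence end. Only standard functions (fill-region, forward-line, looking-at-p and friends) are used.

```elisp
;; Sketch only: the `my-' names below are made up for illustration.
(defun my-sentence-end-line-p ()
  "Return non-nil if the current line ends with `.', `?' or `!'."
  (save-excursion
    (end-of-line)
    ;; Ignore trailing whitespace, but never leave the current line.
    (skip-chars-backward " \t" (line-beginning-position))
    (memq (char-before) '(?. ?? ?!))))

(defun my-sentence-start-line-p ()
  "Return non-nil if the current line starts a sentence.
That is the case on the first line of the buffer, or when the
previous line is blank or ends a sentence."
  (save-excursion
    (beginning-of-line)
    (or (bobp)
        (progn (forward-line -1)
               (or (looking-at-p "[ \t]*$")
                   (my-sentence-end-line-p))))))

(defun my-fill-sentence (&optional justify)
  "Fill only the sentence around point, leaving neighbouring lines alone."
  (interactive "*P")
  (save-excursion
    (let (beg end)
      ;; Walk back to the line that starts the sentence.
      (while (not (my-sentence-start-line-p))
        (forward-line -1))
      (setq beg (line-beginning-position))
      ;; Walk forward to the line that ends it, stopping before a
      ;; blank line or the end of the buffer.
      (while (and (not (my-sentence-end-line-p))
                  (save-excursion
                    (and (zerop (forward-line 1))
                         (not (looking-at-p "[ \t]*$")))))
        (forward-line 1))
      (setq end (line-end-position))
      (fill-region beg end justify))))
```

With point anywhere inside a sentence, M-x my-fill-sentence refills just that sentence. Binding it to M-q in the relevant mode would be one way to use it in place of fill-paragraph.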

I was hoping for some combination of parameters that would make `fill-paragraph` DTRT. – DES
