2011-06-19 28 views

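Unit test design

A few days ago I wrote some code as part of an interview process. The problem was to determine whether a query text is present in a given text. I used a hash table to store the given text (the keys are the words in the text and the values are the positions of each word in the text). Given a query string, I can then look up the positions in the text where the query words occur and display the snippet of the text that contains the largest number of query words. I thought everything was fine up to this point.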
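The original implementation is not shown in the post, so the following is only a minimal sketch of the kind of word-to-positions index described above. The names `build_index` and `query_positions` are hypothetical, and the tokenization rule (lowercase runs of letters, digits, and apostrophes) is an assumption.

```python
import re
from collections import defaultdict


def build_index(text):
    """Map each lowercased word to the list of positions where it occurs.

    Hypothetical helper: the tokenization rule is an assumption, not the
    asker's actual code.
    """
    index = defaultdict(list)
    for position, word in enumerate(re.findall(r"[a-z0-9']+", text.lower())):
        index[word].append(position)
    return index


def query_positions(index, query):
    """Return the sorted positions of every query word found in the text."""
    positions = []
    for word in query.lower().split():
        # .get() avoids creating empty entries for words that are not in the text
        positions.extend(index.get(word, []))
    return sorted(positions)


index = build_index("The deep dish pizza was a deep dish classic")
print(query_positions(index, "deep dish"))  # prints [1, 2, 6, 7]
```

Once the positions are known, choosing a snippet comes down to finding the window of text that covers the most of them.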

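But I was also asked to write unit tests for it. Although I had never written unit tests before, I know why they are so necessary in the development process. So I created some test cases, keeping both average cases and boundary cases in mind. What was not clear to me is that, in order to write test cases, we need to know the correct output beforehand.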

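I started by collecting some input texts for the program together with the corresponding outputs, put them in a class, and later read them as input for my tests.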

The unit test code looks like this:

import unittest
import random
import generate_random_test
import highlight  # module under test; the tests below call highlight.m_Highlight_doc


class c_Known_Output():
    Input_Text1 = '''We ordered the traditional deep dish pizza and a Manchego salad. Were started off with a complimentary bread, that looks like a really big hamburger bun top at first glance. Even though it was free bread, it was soft and slightly sweet and delicious. I liked dipping it in the balsamic reduction and olive oil from the salad plate. The salad dish was perfectly fine, but I wish the Manchego slices on top were somehow sliced a bit thinner. The deep dish traditional pizza came out a bit later (remember the 40 min. cooking time, folks!), piping hot and smelling delicious. At first bite, I wasnt sure how much I liked it.'''

    Output_Text1 = '''I liked dipping it in the balsamic reduction and olive oil from the salad plate. The salad [[HIGHLIGHT]]dish[[ENDHIGHLIGHT]] was perfectly fine, but I wish the Manchego slices on top were somehow sliced a bit thinner. The [[HIGHLIGHT]]deep dish[[ENDHIGHLIGHT]] traditional pizza came out a bit later (remember the 40 min. cooking time, folks!), piping hot and smelling delicious.'''

    Input_Text2 = '''Best tacos I have ever had Lived down the road from this truck for years. Watched almost every episode of BSG eating these tacos with beer. Moved to Az and El Chato is one of the things I miss the most! ANYONE that is around them, you have to go here.'''

    Output_Text2 = '''Best [[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]] I have ever had Lived down the road from this truck for years. Watched almost every episode of BSG eating these [[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]] with beer. Moved to Az and El Chato is one of the things I miss the most!'''

    # Input_Text3/Output_Text3 and Input_Text4/Output_Text4 are used by the
    # tests below but were not included in the post.

    Query_Not_found = '''Query Not Found'''


class c_myTest(unittest.TestCase):
    Generator = generate_random_test.TestCaseGenerator()
    KnowOutput = c_Known_Output()

    def testAverageCase1(self):
        """query words present...they should be highlighted"""
        output = highlight.m_Highlight_doc(self.KnowOutput.Input_Text1, 'deep dish')
        print("\nTest Case 1")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text1)

    def testAverageCase2(self):
        output = highlight.m_Highlight_doc(self.KnowOutput.Input_Text2, 'burrito taco take out')
        print("\nTest Case 2")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text2)

    def testSnippetLength(self):
        """ if the search word is present only once in the text...check if the snippet is of optimum length...optimum length is defined as one sentence before
        and after the sentence in which the query word is present"""
        output = highlight.m_Highlight_doc(self.KnowOutput.Input_Text3, 'tacos')
        print("\nTest Case 3")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text3)

    def testSmallText(self):
        """The text is just one sentence, with the query present in it. The same sentence should be the output"""
        output = highlight.m_Highlight_doc(self.KnowOutput.Input_Text4, 'deep dish pizzas')
        print("\nTest Case 4")
        print(output)
        self.assertEqual(output, self.KnowOutput.Output_Text4)

    def testBadInput(self):
        """no keywords present...no highlight"""
        output = highlight.m_Highlight_doc(self.KnowOutput.Input_Text4, 'tacos')
        print("\nTest Case 5")
        print(output)
        self.assertEqual(output, self.KnowOutput.Query_Not_found)

    # Now test with randomly generated text
    def testDistantKeywords(self):
        """the search queries are very distant in the text. 6 query words are generated. First 4 of these queries are inserted in one paragraph and the last two
        queries are inserted in another. The snippet should be of the first paragraph which has the maximum number of query words present in it."""
        query = self.Generator.generateSentence(6, 5)
        text1 = self.Generator.generateTextwithQuery(query[0:4], 10, 10, 5, 3)
        # query[4:] is the last two of the six query words, as the docstring
        # describes (the posted code used query[5:], which yields only one)
        text2 = self.Generator.generateTextwithQuery(query[4:], 10, 10, 5, 3)
        text1.append('\n')
        text1.extend(text2)
        print("\nTest Case 6")
        print("=========================TEXT==================")
        print(' '.join(text1))
        print("========================QUERY==================")
        print(' '.join(query))
        print(" ")
        output_text = highlight.m_Highlight_doc(' '.join(text1), ' '.join(query))
        print("=======================SNIPPET=================")
        print(output_text)
        print(" ")


if __name__ == '__main__':
    unittest.main()

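Apparently I failed, and no reason was given to me, so now I am trying to find out whether this code was the cause. Can someone help me identify the problems in these unit tests, and explain what exactly one should do when the output of the code has to be known beforehand in order to write unit tests for it? For example, can we write unit tests for a random number generator?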

Thanks in advance!


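Before your next interview, be sure to read up on and practice writing code that follows the Python style guide. An interviewer who sees even a fragment of the code above can easily tell that you have not worked with much Python code before. As an exercise, you could try fixing the code above by following the recommendations here: http://www.python.org/dev/peps/pep-0008/ – Udi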


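I understand that I did not follow the guidelines when writing the Python code (mostly because I started using Python only a week ago and this is my first code! Another mistake... I should have coded in Java or C++), but what I am really interested in knowing is whether we can write unit tests for functions whose output is not known to us, say unit tests for a random number generator. –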


In Python, following style and best practices is very important. If you want to continue with Python, you should accept that. – Udi

Answers

1

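I would say that if you know what your code is supposed to do, then you can write a unit test for it. In your text-search case, I think it is safe to say that you can determine the expected output for a given set of inputs. Your interviewer may have had issues with the text matching and the tests themselves rather than with the principles you used. As for random number generators: yes, you can test them, as long as you remember that computing only has pseudo-random number generators. The realistic, useful things to test that spring to mind are that the generator produces the same output for the same seed, and that the period is no shorter than whatever you have defined it to be. You may or may not care that a given seed produces a pre-established sequence. That should be reflected in the test suite and the documentation.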
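As a rough illustration of those checks, here is a minimal sketch that uses Python's standard `random.Random` as a stand-in for the generator under test. The seed `1234`, the window of 1000 draws, and the `run_tests` helper are arbitrary choices for the example, not something taken from the question.

```python
import random
import unittest


class RandomGeneratorTest(unittest.TestCase):
    def test_same_seed_gives_same_sequence(self):
        # Two generators seeded identically must produce identical output.
        first = random.Random(1234)
        second = random.Random(1234)
        self.assertEqual(
            [first.random() for _ in range(100)],
            [second.random() for _ in range(100)],
        )

    def test_values_stay_in_documented_range(self):
        # random() is documented to return floats in [0.0, 1.0).
        rng = random.Random(1234)
        for _ in range(1000):
            value = rng.random()
            self.assertTrue(0.0 <= value < 1.0)

    def test_no_short_cycle(self):
        # If the sequence had a period shorter than the window, some value
        # would repeat. All-distinct values rule that out. This is stricter
        # than necessary, since a repeated value alone would not prove a cycle.
        rng = random.Random(1234)
        values = [rng.random() for _ in range(1000)]
        self.assertEqual(len(set(values)), len(values))


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RandomGeneratorTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

None of these tests needs to know the individual numbers in advance; they assert properties that any correct output must have. The same idea applies to the randomly generated text in the question: the test can assert, for example, that the returned snippet comes from the paragraph holding the most query words, instead of only printing it.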

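My approach is to start with the tests and then write the code so that they all pass (see test driven development). Not only does this give good test coverage, it also helps define what the code should do before it is written.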
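A minimal sketch of that workflow might look like the following. The tests come first in the file and pin down the behavior; the implementation below them is the simplest code that makes them pass. `highlight_words` and `run_highlight_tests` are hypothetical names and this is not the asker's `m_Highlight_doc`: it only wraps each matching word in the `[[HIGHLIGHT]]` markers from the question and does no snippet selection or merging of adjacent matches.

```python
import re
import unittest

NOT_FOUND = "Query Not Found"


# Step 1: write the tests, which define the expected behavior up front.
class HighlightWordsTest(unittest.TestCase):
    def test_query_word_is_wrapped(self):
        self.assertEqual(
            highlight_words("Best tacos ever", "tacos"),
            "Best [[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]] ever",
        )

    def test_match_is_case_insensitive(self):
        self.assertEqual(
            highlight_words("Tacos and tacos", "tacos"),
            "[[HIGHLIGHT]]Tacos[[ENDHIGHLIGHT]] and [[HIGHLIGHT]]tacos[[ENDHIGHLIGHT]]",
        )

    def test_missing_query_reports_not_found(self):
        self.assertEqual(highlight_words("Best tacos ever", "pizza"), NOT_FOUND)


# Step 2: write just enough code to make the tests above pass.
def highlight_words(text, query):
    """Wrap every whole-word occurrence of a query word in highlight markers."""
    words = [re.escape(word) for word in query.split()]
    if not words:
        return NOT_FOUND
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    highlighted, count = pattern.subn(r"[[HIGHLIGHT]]\1[[ENDHIGHLIGHT]]", text)
    return highlighted if count else NOT_FOUND


def run_highlight_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HighlightWordsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_highlight_tests()
```

Each expected string here is short enough to verify by hand, which keeps the question of what the correct output is separate from whatever the code currently produces.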