2016-06-07 70 views

Getting a subreddit as a PRAW Subreddit object from a link in a post's self text

I'm using PRAW with Python, and I want to be able to:

  1. Go through the "new" posts on a subreddit.
  2. Detect whether a post's self text contains a link to a subreddit.
  3. If there is a subreddit link, get that subreddit as a PRAW object to be used later.

I can do step 1, but detecting whether there is a subreddit link and then getting that subreddit is the part I'm struggling with. Here is what I have so far:

#! python3
# Reply with subreddit info from subreddit in text body

import praw
import time

# Bot login details
USERNAME = "AutoMobBot"
PASSWORD = "<redacted>"

UA = "[Subreddit Info Provider (Update 0) by /u/MatthewMob]"
r = praw.Reddit(UA)
r.login(USERNAME, PASSWORD, disable_warning=True)

submissions = r.get_subreddit("matthewmob_csstesting").get_new(limit=10)

for submission in submissions:
    for word in submission.selftext.lower().split():
        if word.startswith("/r/"):
            print("Found subreddit in:", submission.title)
            print(submission.selftext_html)

print("Done...")
input()

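This just gets the submissions, splits each selftext into words, and prints something if one of the words begins with /r/. Obviously this won't work all the time, for example if the user links the subreddit only as r/askreddit or www.reddit.com/r/askreddit. Even when it does match, if they link to /r/askreddit/top (with something on the end), how would I get that subreddit as a PRAW object? I've been trying to find a regex to help me do this, but haven't found one.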

My main question is: what is the best way to get a subreddit from a link in a user's selftext, and how do I do it?

If you need more clarification, I'm happy to provide more information.

Answer


I have now found my own answer. Here is the code that works for me:

#! python3
# Reply with subreddit info from subreddit in text body

import re

import bs4
import praw

# Bot login details
USERNAME = "AutoMobBot"
PASSWORD = "<Password>"

UA = "[Subreddit Info Provider (Update 4) by /u/MatthewMob]"
r = praw.Reddit(UA)
r.login(USERNAME, PASSWORD, disable_warning=True)

submissions = r.get_subreddit("matthewmob_csstesting").get_new(limit=3)

for submission in submissions:
    # selftext_html may be None (e.g. for link posts), so skip those
    if not submission.selftext_html:
        continue
    subs = []  # unique subreddit names found in this submission
    soup = bs4.BeautifulSoup(submission.selftext_html, "lxml")
    for a in soup.find_all("a", href=True):
        # Append "/" so a link that ends with the subreddit name still matches
        href = a["href"] + "/"
        # re.findall returns a list, never None; it is empty when nothing matches
        for name in re.findall(r"/r/(.*?)/", href):
            if name not in subs:
                subs.append(name)
                print("\nTitle:", submission.title)
                print("\nSubreddits Found:", len(subs))
                print("\nSubreddit Found:", name + "\n")

print("Done...")
input()
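
The code above only sees subreddits that Reddit has turned into hyperlinks in selftext_html. For mentions in the raw selftext (r/askreddit, www.reddit.com/r/askreddit, /r/askreddit/top), a regex along these lines might help. This is only a rough sketch: the function name find_subreddits and the pattern are my own, and the pattern assumes subreddit names are 2 to 21 letters, digits, or underscores, which may not match Reddit's exact rules.

```python
import re

# Assumption: a subreddit name is 2-21 ASCII letters, digits, or underscores.
# (?<!\w) keeps "r/" from matching inside a word such as "car/truck";
# (?!\w) rejects names longer than 21 characters instead of truncating them.
SUBREDDIT_RE = re.compile(r"(?<!\w)/?r/(\w{2,21})(?!\w)", re.ASCII)


def find_subreddits(text):
    """Return unique subreddit names in text, lowercased, in order of first appearance."""
    found = []
    for name in SUBREDDIT_RE.findall(text):
        name = name.lower()
        if name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    sample = "See r/askreddit and www.reddit.com/r/AskReddit, also /r/python/top"
    print(find_subreddits(sample))
```

Each name could then be passed to r.get_subreddit(name), the same call used above, to get the Subreddit object. Note that the pattern also matches any other URL containing /r/, and it does not check that the subreddit actually exists.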