incoming email parsing feature for gmail and yahoo (publiclab#2933)
* Added reply to content field in comment table

* Added incoming email parsing feature for gmail and yahoo

* Added incoming email parsing feature for gmail and yahoo

* Minor changes

* Removed column

* Added filter in comment

* Added mail filter in mails

* Minor changes

* Fixes codeclimate issues

* Modified COMMENT_FILTER

* Added test for parsing gmail and yahoo

* Added detail readme for yahoo and gmail parsing
namangupta01 authored and jywarren committed Jul 3, 2018
1 parent a96305b commit 6843e2f
Showing 16 changed files with 426 additions and 15 deletions.
15 changes: 15 additions & 0 deletions app/helpers/application_helper.rb
@@ -114,4 +114,19 @@ def title_suggestion(comment)
       output
     end
   end
+
+  def filtered_comment_body(comment_body)
+    if contain_trimmed_body?(comment_body)
+      return comment_body.split(Comment::COMMENT_FILTER).first
+    end
+    comment_body
+  end
+
+  def contain_trimmed_body?(comment_body)
+    comment_body.include?(Comment::COMMENT_FILTER)
+  end
+
+  def trimmed_body(comment_body)
+    comment_body.split(Comment::COMMENT_FILTER).second
+  end
 end
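
A minimal sketch of how the three helpers above divide a stored comment body, written as plain Ruby so it runs outside Rails. The constant is copied from `Comment::COMMENT_FILTER`; the limit argument to `split` and the use of `[1]` in place of ActiveSupport's `second` are assumptions of this sketch, not part of the commit.

```ruby
# Stand-in for Comment::COMMENT_FILTER as defined in this commit.
COMMENT_FILTER = "<!-- @@$$%% Trimmed Content @@$$%% -->".freeze

# True when the body carries the marker separating the reply from quoted thread text.
def contain_trimmed_body?(comment_body)
  comment_body.include?(COMMENT_FILTER)
end

# The part shown by default: everything before the marker.
def filtered_comment_body(comment_body)
  return comment_body unless contain_trimmed_body?(comment_body)
  comment_body.split(COMMENT_FILTER, 2).first
end

# The part hidden behind the "..." toggle: everything after the marker, or nil.
def trimmed_body(comment_body)
  comment_body.split(COMMENT_FILTER, 2)[1]
end

body = "This is another reply#{COMMENT_FILTER}On Tue, Jul 3, 2018 someone wrote:"
puts filtered_comment_body(body)
puts trimmed_body(body)
```

Passing a limit of 2 keeps everything after the first marker together, which matters only if the marker text ever appears twice in one body.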
1 change: 0 additions & 1 deletion app/mailers/comment_mailer.rb
@@ -1,6 +1,5 @@
 class CommentMailer < ActionMailer::Base
   helper :application
-  require 'byebug'
   include ApplicationHelper
   default from: "notifications@#{ActionMailer::Base.default_url_options[:host]}"
83 changes: 71 additions & 12 deletions app/models/comment.rb
@@ -13,6 +13,8 @@ class Comment < ApplicationRecord
   self.table_name = 'comments'
   self.primary_key = 'cid'

+  COMMENT_FILTER = "<!-- @@$$%% Trimmed Content @@$$%% -->".freeze
+
   def self.inheritance_column
     'rails_type'
   end
@@ -220,22 +222,79 @@ def user_reactions_map
     user_like_map
   end

-  def self.receive_mail(message)
-    node_id = message.subject[/#([\d]+)/, 1] #This took out the node ID from the subject line
-    puts node_id
-    unless node_id.nil?
-      node = Node.where(nid: node_id)
-      if node.any?
-        node = node.first
-        user = User.find_by(email: message.from.first)
-        if user.present? && node_id.present?
-          message_markdown = ReverseMarkdown.convert message.html_part.body.decoded
-          message_id = message.message_id
-          comment = node.add_comment(uid: user.uid, body: message_markdown, comment_via: 1, message_id: message_id)
+  def self.receive_mail(mail)
+    user = User.where(email: mail.from.first).first
+    if user
+      node_id = mail.subject[/#([\d]+)/, 1] #This took out the node ID from the subject line
+      unless node_id.nil?
+        node = Node.where(nid: node_id).first
+        if node
+          mail_doc = Nokogiri::HTML(mail.html_part.body.decoded) # To parse the mail to extract comment content and reply content
+          domain = get_domain mail.from.first
+          if domain == "gmail"
+            content = gmail_parsed_mail mail_doc
+          elsif domain == "yahoo"
+            content = yahoo_parsed_mail mail_doc
+          else
+            content = {
+              "comment_content" => mail_doc,
+              "extra_content" => nil
+            }
+          end
+
+          if content["extra_content"].nil?
+            comment_content_markdown = ReverseMarkdown.convert content["comment_content"]
+          else
+            extra_content_markdown = ReverseMarkdown.convert content["extra_content"]
+            comment_content_markdown = ReverseMarkdown.convert content["comment_content"]
+            comment_content_markdown = comment_content_markdown + COMMENT_FILTER + extra_content_markdown
+          end
+          message_id = mail.message_id
+          comment = node.add_comment(uid: user.uid, body: comment_content_markdown, comment_via: 1, message_id: message_id)
           comment.notify user
         end
       end
     end
   end

+  def self.get_domain(email)
+    domain = email[/(?<=@)[^.]+(?=\.)/, 0]
+  end

+  def self.yahoo_parsed_mail(mail_doc)
+    if mail_doc.css(".yahoo_quoted").any?
+      extra_content = mail_doc.css(".yahoo_quoted")[0]
+      mail_doc.css(".yahoo_quoted")[0].remove
+      comment_content = mail_doc
+    else
+      comment_content = mail_doc
+      extra_content = nil
+    end
+
+    {
+      "comment_content" => comment_content,
+      "extra_content" => extra_content
+    }
+  end

+  def self.gmail_parsed_mail(mail_doc)
+    if mail_doc.css(".gmail_quote").any?
+      extra_content = mail_doc.css(".gmail_quote")[0]
+      mail_doc.css(".gmail_quote")[0].remove
+      comment_content = mail_doc
+    else
+      comment_content = mail_doc
+      extra_content = nil
+    end
+
+    {
+      "comment_content" => comment_content,
+      "extra_content" => extra_content
+    }
+  end
+
+  def trimmed_content?
+    comment.include?(COMMENT_FILTER)
+  end

 end
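
The lookups that `receive_mail` performs before any HTML parsing can be tried in isolation. The sketch below is plain Ruby with hypothetical helper names (`node_id_from_subject`, `mail_domain`, `quote_selector_for`) and a hypothetical `QUOTE_SELECTORS` table; only the two regular expressions and the two CSS class names are taken from the commit.

```ruby
# Pull the node ID out of a subject such as "Re: New Comment on Note (#21)".
# Returns the digits as a String, or nil when the subject has no "#<digits>".
def node_id_from_subject(subject)
  subject[/#(\d+)/, 1]
end

# Return the label between "@" and the first following dot, e.g. "gmail".
def mail_domain(email)
  email[/(?<=@)[^.]+(?=\.)/]
end

# Hypothetical lookup table: provider label => class that wraps quoted thread content.
QUOTE_SELECTORS = {
  "gmail" => ".gmail_quote",
  "yahoo" => ".yahoo_quoted"
}.freeze

# nil means "no provider-specific parsing", matching the else branch in receive_mail.
def quote_selector_for(email)
  QUOTE_SELECTORS[mail_domain(email)]
end

p node_id_from_subject("Re: New Comment on Note (#21)")
p quote_selector_for("01namangupta@gmail.com")
```

Because the domain pattern stops at the first dot, an address on a subdomain such as `user@mail.yahoo.com` yields `"mail"` and would take the generic branch.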
5 changes: 5 additions & 0 deletions app/models/concerns/comments_shared.rb
@@ -2,10 +2,15 @@
 # Refer to this link: http://stackoverflow.com/questions/14541823/how-to-use-concerns-in-rails-4
 module CommentsShared
   extend ActiveSupport::Concern
+  include ApplicationHelper

   # filtered version additionally appending http/https
   # protocol to protocol-relative URLs like "//foo"
   def body_email(host = 'publiclab.org')
+    if contain_trimmed_body?(body)
+      comment_body = filtered_comment_body(body)
+      return comment_body.gsub(/([\s|"|'|\[|\(])(\/\/)([\w]?\.?#{host})/, '\1https://\3')
+    end
     body.gsub(/([\s|"|'|\[|\(])(\/\/)([\w]?\.?#{host})/, '\1https://\3')
   end
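
A small sketch of the substitution `body_email` applies, assuming a standalone method named `absolutize_links` (hypothetical). It follows the same three-group pattern, but drops the literal `|` characters from the character class and escapes the host with `Regexp.escape`; both are changes made for this illustration, not what the commit does.

```ruby
# Rewrite protocol-relative links ("//host/...") to https links for the given host.
# A link is rewritten only when preceded by whitespace, a quote, "[" or "(".
def absolutize_links(text, host = 'publiclab.org')
  pattern = /([\s"'\[\(])(\/\/)(\w?\.?#{Regexp.escape(host)})/
  text.gsub(pattern, '\1https://\3')
end

puts absolutize_links('See [map](//publiclab.org/map) and "//i.publiclab.org/x.png"')
```

The optional `\w?\.?` prefix is what lets a one-letter subdomain such as `i.publiclab.org` match alongside the bare host.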

10 changes: 9 additions & 1 deletion app/views/notes/_comment.html.erb
@@ -106,7 +106,15 @@
 </script>

 <div style="border: 1px solid #e7e7e7;padding: 18px;" id="c<%= comment.cid %>show">
-  <p class="comment-body" id="comment-body-<%= comment.cid %>"><%= raw render_comment_body(comment) %></p>
+
+  <% comment_body = render_comment_body(comment) %>
+  <p id="comment-body-<%= comment.cid %>"><%= raw filtered_comment_body(comment_body) %></p>
+
+  <% if contain_trimmed_body?(comment_body) %>
+    <span><a class="email-toggle" data-toggle="collapse" data-target="#comment-<%= comment.cid %>-extra-content">...</a></span>
+    <div class="collapse" id="comment-<%= comment.cid %>-extra-content" ><%= raw trimmed_body(comment_body) %></div>
+  <% end %>
+
   <% if comment.body.include?('?') %>
     <p class="alert alert-info">Is this a question? <a href="/questions/new?title=<%= comment.body %>">Click here</a> to post it to the <a href="/questions">Questions page</a>.
     </p>
14 changes: 14 additions & 0 deletions test/fixtures/drupal_users.yml
@@ -74,3 +74,17 @@ legacy_user:
   status: 1
   mail: legacy@publiclab.org
   created: <%= Time.now.to_i %>
+
+gmail_test:
+  name: namangupta
+  mail: 01namangupta@gmail.com
+  uid: 12
+  status: 1
+  created: <%= Time.now.to_i %>
+
+naman18996:
+  name: naman18996
+  mail: naman18996@yahoo.com
+  uid: 13
+  status: 1
+  created: <%= Time.now.to_i %>
26 changes: 26 additions & 0 deletions test/fixtures/users.yml
@@ -132,3 +132,29 @@ test_user:
   bio: ''
   created_at: <%= Time.now %>
   updated_at: <%= Time.now %>
+
+naman:
+  username: namangupta
+  status: 1
+  email: 01namangupta@gmail.com
+  id: 12
+  last_request_at: <%= Time.now %>
+  password_salt: <%= salt = Authlogic::Random.hex_token %>
+  crypted_password: <%= Authlogic::CryptoProviders::Sha512.encrypt("secretive" + salt) %>
+  persistence_token: <%= Authlogic::Random.hex_token %>
+  bio: ''
+  created_at: <%= Time.now %>
+  updated_at: <%= Time.now %>
+
+naman18996:
+  username: naman18996
+  status: 1
+  email: naman18996@yahoo.com
+  id: 13
+  last_request_at: <%= Time.now %>
+  password_salt: <%= salt = Authlogic::Random.hex_token %>
+  crypted_password: <%= Authlogic::CryptoProviders::Sha512.encrypt("secretive" + salt) %>
+  persistence_token: <%= Authlogic::Random.hex_token %>
+  bio: ''
+  created_at: <%= Time.now %>
+  updated_at: <%= Time.now %>
11 changes: 11 additions & 0 deletions test/incoming_test_emails/gmail/final_parsed_comment.txt
@@ -0,0 +1,11 @@
This is another reply by email comment

<!-- @@$$%% Trimmed Content @@$$%% -->On Tue, Jul 3, 2018 at 11:17 PM Naman Gupta \<[01namangupta@gmail.com](mailto:01namangupta@gmail.com)\> wrote:

> This is reply by comment
>
>
> On Tue, Jul 3, 2018 at 11:13 PM Rails Projects \<[railsprojects2018@gmail.com](mailto:railsprojects2018@gmail.com)\> wrote:
>
> > This is a comment sent to the user by publiclab.

107 changes: 107 additions & 0 deletions test/incoming_test_emails/gmail/incoming_gmail_email.eml
@@ -0,0 +1,107 @@
Delivered-To: railsprojects2018@gmail.com
Received: by 2002:a1c:924f:0:0:0:0:0 with SMTP id u76-v6csp1316358wmd;
Tue, 3 Jul 2018 10:48:13 -0700 (PDT)
X-Received: by 2002:a50:e615:: with SMTP id y21-v6mr29798400edm.278.1530640093065;
Tue, 03 Jul 2018 10:48:13 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1530640093; cv=none;
d=google.com; s=arc-20160816;
b=iwlu7+0DRR8CFpRnB6GQ1MSWcjrLdSomGi3xKTijlKdmBbwQrxmMpX6SNnzskE3xVC
SiylMtj5yTcxWmWTuKuOKLbprkSobBQ1vu+xRV1J7S28a/1nYOKEipCQKh3bgLl7lIqr
vRCnNZqt+afuol0O+97HmZAp3yYBmUSL6ArW+G5mId1Zxc25uV037xJeA/mKyr7FccW9
D6W86z9XjMzlBUWB2k3V0vbsVJhBgxXxq22VJWx2ZmK4fTiyNsBqqjfjLyGFGdMVy/e6
9jvl66h3XmhJp49kWbWSlFk1RKGTXvDZJ4ZhwnBZ6wqUYtrXmwg85ibmdXxDCvYANvkc
JUbw==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
h=to:subject:message-id:date:from:in-reply-to:references:mime-version
:dkim-signature:arc-authentication-results;
bh=RYGhy7yx79U1IqJDo1X6kis0HCj0a5eCbuu3JzStGEU=;
b=a3X4+9aFlZEKknmilS9b18oXGSTJxvMhh/nIqhFIL1fxYIdE7XCXY+TTLwrjTkWIo5
XaghwYmv8vCedU2v0KfojMGaN6fkZXqXr+lcsYpRJqSUALoaYhJlAOl/bL54PzXewLZ6
5xrraJMyDdo/xBqd6P12Cndza5il6/kpiSHPdoGb/4F/zQOQ15ta0n8veXAGFTk3k4Yk
T9MrzB0ogcaFDYlJtPFRCHz4YenzAN3JZ0hmxwLnO7f0DfGvCFi35fAOFpxCCCzd4KzG
vhMcuS/LPCQNPUbN8NGRfSHAULgGtFNGL7PmG6FKFY+iPD10AL/zIFOlHoiLsJucgNE+
hmpw==
ARC-Authentication-Results: i=1; mx.google.com;
dkim=pass header.i=@gmail.com header.s=20161025 header.b=plagx6JT;
spf=pass (google.com: domain of 01namangupta@gmail.com designates 209.85.220.41 as permitted sender) smtp.mailfrom=01namangupta@gmail.com;
dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com
Return-Path: <01namangupta@gmail.com>
Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41])
by mx.google.com with SMTPS id k17-v6sor1127557edr.39.2018.07.03.10.48.13
for <railsprojects2018@gmail.com>
(Google Transport Security);
Tue, 03 Jul 2018 10:48:13 -0700 (PDT)
Received-SPF: pass (google.com: domain of 01namangupta@gmail.com designates 209.85.220.41 as permitted sender) client-ip=209.85.220.41;
Authentication-Results: mx.google.com;
dkim=pass header.i=@gmail.com header.s=20161025 header.b=plagx6JT;
spf=pass (google.com: domain of 01namangupta@gmail.com designates 209.85.220.41 as permitted sender) smtp.mailfrom=01namangupta@gmail.com;
dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=gmail.com; s=20161025;
h=mime-version:references:in-reply-to:from:date:message-id:subject:to;
bh=RYGhy7yx79U1IqJDo1X6kis0HCj0a5eCbuu3JzStGEU=;
b=plagx6JTlgloc5Mt7ji72zSoeeF46aFIv2TCnUbBW/qcTFIMyAiIgebAf1cR8YnkEx
8taYLU3XfDwJMpSsqnyWxANkc7QqUQUSguvA9fVWq2HlUvULzPjhagsUyC28i2U+4P/V
lCu5YNJJD3skx5yCKQ8SgtVgJTQEYcJYMQII9lfdhOcgvN54NuttBEi8YhiTnFvelhcF
TvrQO/S2I8/djSLXVT9SId1UA9gSEdGScTnCS5j1+F7RQVjyE9wHnypAmv+oL4v2HhGD
YbCPjieeOnRYH80s9GQuY/S+eRaDp2M+hKoDo3wy7pkxqGgPN1LWm7by5AySjJyVhd0f
QtRQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20161025;
h=x-gm-message-state:mime-version:references:in-reply-to:from:date
:message-id:subject:to;
bh=RYGhy7yx79U1IqJDo1X6kis0HCj0a5eCbuu3JzStGEU=;
b=pZSQnj3qWHgBs4OCiHQ0VXZ/Iovfx3KJym43xyzq2xi/Mp8GvsLNmR5XV7k3MLll2Z
WPb3MlLXHW8K04wESSWvHs5Wg85hjpphaUdGbqicxA61W2/vBZpdRL02p/wbuQ3Pu3HA
M8gASdw8PNJIXZNg053R7gxGGKnR3fW3fx1SbZCi88L1bT+Qihtx6gcI3Eq6gs6xSUa5
p2V9JOlKI9CldZLH2ktxBSxfZJoMHCeDJBN720VRCkOpGH/Kmf2LRFuyCVSC7y4U9Ly9
RmQchyy65WQqZKXZ6U+CCQZQ1FfSWjvjTCt0JfAkt/rJPgDzc6gzLD/vOgQAK3MD5R98
W3DA==
X-Gm-Message-State: APt69E0SpKbIzAW22w+TY0xcx9Frrfnhb5xRiuXK430nqF2cSWvL1qtb vBWot7Yt6WZlROATEbXleNghfNTqUfcasRA12stohoZE
X-Google-Smtp-Source: AAOMgpf0oHAY3q9Mx0qWa/782HzUaDAgN/Ccf/jg0W9zN87UGy/WmbrLh2TARouOGNHrMLvcRb52bY0k8emcXvSLdBo=
X-Received: by 2002:a50:9493:: with SMTP id s19-v6mr29606779eda.285.1530640092632; Tue, 03 Jul 2018 10:48:12 -0700 (PDT)
MIME-Version: 1.0
References: <CAN4wNrnDfK_P4UXVMa7yp_YGJwOvH3K_vMUq-aQKAGQQkysPZQ@mail.gmail.com> <CAA3L2u4NZL9kKGCag8K8kc8Qr6XCvH=CwejpHshfpDHOzc_OhQ@mail.gmail.com>
In-Reply-To: <CAA3L2u4NZL9kKGCag8K8kc8Qr6XCvH=CwejpHshfpDHOzc_OhQ@mail.gmail.com>
From: Naman Gupta <01namangupta@gmail.com>
Date: Tue, 3 Jul 2018 23:18:01 +0530
Message-ID: <CAA3L2u5PMJ80Z--xTDo5Bow6xVWy=DU9UzFHffD6tmuk+ynu8w@mail.gmail.com>
Subject: Re: New Comment on Note (#21)
To: railsprojects2018@gmail.com
Content-Type: multipart/alternative; boundary="000000000000001a4f05701beb69"

--000000000000001a4f05701beb69
Content-Type: text/plain; charset="UTF-8"
This is another reply by email comment
On Tue, Jul 3, 2018 at 11:17 PM Naman Gupta <01namangupta@gmail.com> wrote:
> This is reply by comment
>
> On Tue, Jul 3, 2018 at 11:13 PM Rails Projects <
> railsprojects2018@gmail.com> wrote:
>
>> This is a comment sent to the user by publiclab.
>>
>
--000000000000001a4f05701beb69
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">This is another reply by email comment</div><br><div class=
=3D"gmail_quote"><div dir=3D"ltr">On Tue, Jul 3, 2018 at 11:17 PM Naman Gup=
ta &lt;<a href=3D"mailto:01namangupta@gmail.com">01namangupta@gmail.com</a>=
&gt; wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 =
0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr">This i=
s reply by comment</div><br><div class=3D"gmail_quote"><div dir=3D"ltr">On =
Tue, Jul 3, 2018 at 11:13 PM Rails Projects &lt;<a href=3D"mailto:railsproj=
ects2018@gmail.com" target=3D"_blank">railsprojects2018@gmail.com</a>&gt; w=
rote:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex=
;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr">This is a co=
mment sent to the user by publiclab.</div>
</blockquote></div>
</blockquote></div>

--000000000000001a4f05701beb69--
15 changes: 15 additions & 0 deletions test/incoming_test_emails/gmail/incoming_gmail_email.html
@@ -0,0 +1,15 @@
<div dir="ltr">This is another reply by email comment</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Tue, Jul 3, 2018 at 11:17 PM Naman Gupta &lt;<a href="mailto:01namangupta@gmail.com">01namangupta@gmail.com</a>&gt; wrote:<br></div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">This is reply by comment</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Tue, Jul 3, 2018 at 11:13 PM Rails Projects &lt;<a href="mailto:railsprojects2018@gmail.com" target="_blank">railsprojects2018@gmail.com</a>&gt; wrote:<br></div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">This is a comment sent to the user by publiclab.</div>
</blockquote>
</div>
</blockquote>
</div>
9 changes: 9 additions & 0 deletions test/incoming_test_emails/gmail/readme.md
@@ -0,0 +1,9 @@
This documents the steps used to parse incoming mail.

1. `incoming_gmail_email.eml` is a sample incoming mail from Gmail.
2. `incoming_gmail_email.html` contains `incoming_gmail_email.eml` converted to HTML.
3. `final_parsed_comment.txt` contains the final parsed comment.

To parse mail coming from Gmail, we separate the main body content from the trimmed content, which holds the conversation thread. Gmail wraps the quoted thread in an element with the class `gmail_quote`, so that class marks the boundary between the two.

In `incoming_gmail_email.html`, the HTML element with the class `gmail_quote` is the trimmed content. We parse the HTML with Nokogiri and split the main body content from the trimmed content using that class selector.
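
As a rough illustration of the last step, the sketch below shows how the two parts could be joined into the text stored in `final_parsed_comment.txt`. It is plain Ruby and skips the Nokogiri and ReverseMarkdown stages; `assemble_comment` is a hypothetical name, the sample strings are shortened, and the constant is copied from `Comment::COMMENT_FILTER`.

```ruby
# Stand-in for Comment::COMMENT_FILTER as defined in this commit.
COMMENT_FILTER = "<!-- @@$$%% Trimmed Content @@$$%% -->".freeze

# Join the reply and the quoted thread in the order receive_mail uses:
# reply first, then the marker, then the quoted content. With no quoted
# content, the reply is stored unchanged.
def assemble_comment(reply_markdown, quoted_markdown = nil)
  return reply_markdown if quoted_markdown.nil?
  reply_markdown + COMMENT_FILTER + quoted_markdown
end

reply  = "This is another reply by email comment\n\n"
quoted = "On Tue, Jul 3, 2018 at 11:17 PM Naman Gupta wrote:\n\n> This is reply by comment\n"

stored = assemble_comment(reply, quoted)
puts stored
```

Since the marker is an HTML comment, it should stay invisible wherever the stored body is rendered as HTML without being split first.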
10 changes: 10 additions & 0 deletions test/incoming_test_emails/yahoo/final_parsed_comment.txt
@@ -0,0 +1,10 @@
This is reply by mail comment from yahoooo

<!-- @@$$%% Trimmed Content @@$$%% --> On Tuesday, 3 July 2018, 11:20:57 PM IST, Rails Projects \<railsprojects2018@gmail.com\> wrote:





This is a comment sent to the user by publiclab.
