Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge from grarak #1

Merged
merged 189 commits into from
Sep 30, 2015
Merged

merge from grarak #1

merged 189 commits into from
Sep 30, 2015

Conversation

Martinusbe
Copy link

No description provided.

imoseyon and others added 30 commits September 13, 2015 10:01
Conflicts:
	drivers/video/msm/mdss/Kconfig
	drivers/video/msm/mdss/Makefile
	drivers/video/msm/mdss/mdss_mdp_pp.c

Conflicts:
	drivers/video/msm/mdss/Kconfig
 - Unify enable/update routines
 - Remove no longer used update call from probe function
 - Pass lut_data struct to all kcal/mdss functions
 - Thank you to cyanogen for PCC code reference
 * MDSS5 supports Polynomial Color Correction. Use this to implement
   a simple sysfs API for adjusting RGB scaling values. This can be
   used to implement color temperature and other controls.
 * Why use this when we have KCAL? This code is dead simple, the
   interface is in the right place, and it allows for 128X accuracy.

Change-Id: Ib848d6c9dbdf41f61cc7539a61138d6632ceba94

Conflicts:
	drivers/video/msm/mdss/mdss_fb.c

Conflicts:
	drivers/video/msm/mdss/mdss_fb.c
 * DSP supports this swimmingly.

Change-Id: I11696dd49253c5a310b32c0bfe1a878009597b24

Conflicts:
	arch/arm/mach-msm/qdsp6v2/aac_in.c
ARM has some private syscalls (for example, set_tls(2)) which lie
outside the range of NR_syscalls.  If any of these are called while
syscall tracing is being performed, out-of-bounds array access will
occur in the ftrace and perf sys_{enter,exit} handlers.

 # trace-cmd record -e raw_syscalls:* true && trace-cmd report
 ...
 true-653   [000]   384.675777: sys_enter:            NR 192 (0, 1000, 3, 4000022, ffffffff, 0)
 true-653   [000]   384.675812: sys_exit:             NR 192 = 1995915264
 true-653   [000]   384.675971: sys_enter:            NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1)
 true-653   [000]   384.675988: sys_exit:             NR 983045 = 0
 ...

 # trace-cmd record -e syscalls:* true
 [   17.289329] Unable to handle kernel paging request at virtual address aaaaaace
 [   17.289590] pgd = 9e71c000
 [   17.289696] [aaaaaace] *pgd=00000000
 [   17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 [   17.290169] Modules linked in:
 [   17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21
 [   17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000
 [   17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8
 [   17.290866] LR is at syscall_trace_enter+0x124/0x184

Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers.

Commit cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
added the check for less than zero, but it should have also checked
for greater than NR_syscalls.

Link: http://lkml.kernel.org/p/1414620418-29472-1-git-send-email-rabin@rab.in

Fixes: cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
When a key is being garbage collected, it's key->user would get put before
the ->destroy() callback is called, where the key is removed from it's
respective tracking structures.

This leaves a key hanging in a semi-invalid state which leaves a window open
for a different task to try an access key->user. An example is
find_keyring_by_name() which would dereference key->user for a key that is
in the process of being garbage collected (where key->user was freed but
->destroy() wasn't called yet - so it's still present in the linked list).

This would cause either a panic, or corrupt memory.

Fixes CVE-2014-9529.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Not caching dst_entries which cause redirects could be exploited by hosts
on the same subnet, causing a severe DoS attack. This effect aggravated
since commit f886497 ("ipv4: fix dst race in sk_dst_get()").

Lookups causing redirects will be allocated with DST_NOCACHE set which
will force dst_release to free them via RCU.  Unfortunately waiting for
RCU grace period just takes too long, we can end up with >1M dst_entries
waiting to be released and the system will run OOM. rcuos threads cannot
catch up under high softirq load.

Attaching the flag to emit a redirect later on to the specific skb allows
us to cache those dst_entries thus reducing the pressure on allocation
and deallocation.

This issue was discovered by Marcelo Leitner.

Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Chernenkov used KASAN to discover that eCryptfs writes past the
end of the allocated buffer during encrypted filename decoding. This
fix corrects the issue by getting rid of the unnecessary 0 write when
the current bit offset is 2.

Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Reported-by: Dmitry Chernenkov <dmitryc@google.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org # v2.6.29+: 51ca58d eCryptfs: Filename Encryption: Encoding and encryption functions
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Given following iptables ruleset:

-P FORWARD DROP
-A FORWARD -m sctp --dport 9 -j ACCEPT
-A FORWARD -p tcp --dport 80 -j ACCEPT
-A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT

One would assume that this allows SCTP on port 9 and TCP on port 80.
Unfortunately, if the SCTP conntrack module is not loaded, this allows
*all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
which we think is a security issue.

This is because on the first SCTP packet on port 9, we create a dummy
"generic l4" conntrack entry without any port information (since
conntrack doesn't know how to extract this information).

All subsequent packets that are unknown will then be in established
state since they will fallback to proto_generic and will match the
'generic' entry.

Our originally proposed version [1] completely disabled generic protocol
tracking, but Jozsef suggests to not track protocols for which a more
suitable helper is available, hence we now mitigate the issue for in
tree known ct protocol helpers only, so that at least NAT and direction
information will still be preserved for others.

 [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html

Joint work with Daniel Borkmann.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Kenton Varda <kenton@sandstorm.io> discovered that by remounting a
read-only bind mount read-only in a user namespace the
MNT_LOCK_READONLY bit would be cleared, allowing an unprivileged user
to the remount a read-only mount read-write.

Correct this by replacing the mask of mount flags to preserve
with a mask of mount flags that may be changed, and preserve
all others.   This ensures that any future bugs with this mask and
remount will fail in an easy to detect way where new mount flags
simply won't change.

Cc: stable@vger.kernel.org
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Conflicts:
	fs/namespace.c
Andy Lutomirski recently demonstrated that when chroot is used to set
the root path below the path for the new ``root'' passed to pivot_root
the pivot_root system call succeeds and leaks mounts.

In examining the code I see that starting with a new root that is
below the current root in the mount tree will result in a loop in the
mount tree after the mounts are detached and then reattached to one
another.  Resulting in all kinds of ugliness including a leak of that
mounts involved in the leak of the mount loop.

Prevent this problem by ensuring that the new mount is reachable from
the current root of the mount tree.

[Added stable cc.  Fixes CVE-2014-7970.  --Andy]

Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/87bnpmihks.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>

Conflicts:
	fs/namespace.c

Conflicts:
	fs/namespace.c
We used to read file_handle twice. Once to get the amount of extra bytes, and
once to fetch the entire structure.

This may be problematic since we do size verifications only after the first
read, so if the number of extra bytes changes in userspace between the first
and second calls, we'll have an incoherent view of file_handle.

Instead, read the constant size once, and copy that over to the final
structure without having to re-read it again.

Change-Id: Ib05e5129629e27d5a05953098c5bc470fae40d2a
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Sasha Levin found a NULL pointer dereference that is due to a missing
page table lock, which in turn is due to the pmd entry in question being
a transparent huge-table entry.

The code - introduced in commit 1998cc0 ("mm: make
madvise(MADV_WILLNEED) support swap file prefetch") - correctly checks
for this situation using pmd_none_or_trans_huge_or_clear_bad(), but it
turns out that that function doesn't work correctly.

pmd_none_or_trans_huge_or_clear_bad() expected that pmd_bad() would
trigger if the transparent hugepage bit was set, but it doesn't do that
if pmd_numa() is also set. Note that the NUMA bit only gets set on real
NUMA machines, so people trying to reproduce this on most normal
development systems would never actually trigger this.

Fix it by removing the very subtle (and subtly incorrect) expectation,
and instead just checking pmd_trans_huge() explicitly.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
[ Additionally remove the now stale test for pmd_trans_huge() inside the
  pmd_bad() case - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A local route may have a lower hop_limit set than global routes do.

RFC 3756, Section 4.2.7, "Parameter Spoofing"

>   1.  The attacker includes a Current Hop Limit of one or another small
>       number which the attacker knows will cause legitimate packets to
>       be dropped before they reach their destination.

>   As an example, one possible approach to mitigate this threat is to
>   ignore very small hop limits.  The nodes could implement a
>   configurable minimum hop limit, and ignore attempts to set it below
>   said limit.

Signed-off-by: D.S. Ljungmark <ljungmark@modio.se>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The timeout entries are sizeof(int) rather than sizeof(long), which
means that when they were getting read we'd also leak kernel memory
to userspace along with the timeout values.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes.  Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.

This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).

The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s), but not in the multiple distinct
networks case.

For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.

Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC
flag appropriately set).  A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.

Also: add an entry in ip-sysctl.txt for optimistic_dad.

[cherry-pick of net-next 7fd2561]

Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug: 17769720
Bug: 18180674
Change-Id: Ic7e50781c607e1f3a492d9ce7395946efb95c533

Conflicts:
	include/uapi/linux/ipv6.h
Other drivers write to these regs (KCAL, Sony) and other developers may
implement more than one driver. Make sure we are always reporting the correct
PCC values.

Change-Id: Id4a28602d6678d8032f1328c49163b52c15d52b1
Patch that allows for call recording in
stereo (left channel: TX; right channel:
RX). Tested on Nexus 5 (MSM8974). Likely
works on other recent MSM chipsets as
well.

Change-Id: I934c252283d1346f9c1f6cf786ce2f661f20a049

Conflicts:
	sound/soc/msm/qdsp6v2/msm-pcm-voice-v2.c
We have two problems in UDP stack related to bogus checksums :

1) We return -EAGAIN to application even if receive queue is not empty.
   This breaks applications using edge trigger epoll()

2) Under UDP flood, we can loop forever without yielding to other
   processes, potentially hanging the host, especially on non SMP.

This patch is an attempt to make things better.

We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Conflicts:
	net/ipv4/udp.c

Conflicts:
	net/ipv4/udp.c
commit 8b01fc8 upstream.

This prevents a race between chown() and execve(), where chowning a
setuid-user binary to root would momentarily make the binary setuid
root.

This patch was mostly written by Linus Torvalds.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Charles Williams <ciwillia@brocade.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
FIOPS (Fair IOPS) ioscheduler is IOPS based ioscheduler, so only targets
for drive without I/O seek. It's quite similar like CFQ, but the dispatch
decision is made according to IOPS instead of slice.

The algorithm is simple. Drive has a service tree, and each task lives in
the tree. The key into the tree is called vios (virtual I/O). Every request
has vios, which is calculated according to its ioprio, request size and so
on. Task's vios is the sum of vios of all requests it dispatches. FIOPS
always selects task with minimum vios in the service tree and let the task
dispatch request. The dispatched request's vios is then added to the task's
vios and the task is repositioned in the sevice tree.

Unlike CFQ, FIOPS doesn't have separate sync/async queues, because with I/O
less writeback, usually a task can only dispatch either sync or async requests.
Bias read or write request can still be done with read/write scale.

One issue is if workload iodepth is lower than drive queue_depth, IOPS
share of a task might not be strictly according to its priority, request
Bias read or write request can still be done with read/write scale.

One issue is if workload iodepth is lower than drive queue_depth, IOPS
share of a task might not be strictly according to its priority, request
size and so on. In this case, the drive is in idle actually. Solving the
problem need make drive idle, so impact performance. I believe CFQ isn't
completely fair between tasks in such case too.

Change-Id: I1f86b964ada1e06ac979899ca05f1082d0d8228d
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
read/write speed of Flash based storage usually is different. For example,
in my SSD maxium thoughput of read is about 3 times faster than that of
write. Add a scale to differenate read and write. Also add a tunable, so
user can assign different scale for read and write.

By default, the scale is 1:1, which means the scale is a noop.

Change-Id: Ic223e96d1c72591ef535307755d78ff33dbc6939
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
CFQ gives 2.5 times more share to sync workload. This matches CFQ.

Note this is different with the read/write scale. We have 3 types of
requests:
1. read
2. sync write
3. write
CFQ doesn't differentitate type 1 and 2, but request cost of 1 and 2
are usually different for flash based storage. So we have both sync/async
and read/write scale here.

Change-Id: I3b36c94ba63df6d7a823c941a34a479da6243f20
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Add CFQ-like ioprio support. Priority A will get 20% more share than priority
A+1, which matches CFQ.

Change-Id: I0d6f145810e3f0979440063c030cddf30ad4179c
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
If the task has running request, even it's added into service tree newly,
we preserve its vios key, so it will not lost its share. This should work
for task driving big queue depth. For single depth task, there is no approach
to preserve its vios key.

Change-Id: I40bdaff6430b783b965ca434ffc46b7205b554cd
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Martinusbe added a commit that referenced this pull request Sep 30, 2015
@Martinusbe Martinusbe merged commit 1bd6bab into Validus-Kernel:lp5.1 Sep 30, 2015
Martinusbe pushed a commit that referenced this pull request Oct 2, 2015
ARM has some private syscalls (for example, set_tls(2)) which lie
outside the range of NR_syscalls.  If any of these are called while
syscall tracing is being performed, out-of-bounds array access will
occur in the ftrace and perf sys_{enter,exit} handlers.

 # trace-cmd record -e raw_syscalls:* true && trace-cmd report
 ...
 true-653   [000]   384.675777: sys_enter:            NR 192 (0, 1000, 3, 4000022, ffffffff, 0)
 true-653   [000]   384.675812: sys_exit:             NR 192 = 1995915264
 true-653   [000]   384.675971: sys_enter:            NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1)
 true-653   [000]   384.675988: sys_exit:             NR 983045 = 0
 ...

 # trace-cmd record -e syscalls:* true
 [   17.289329] Unable to handle kernel paging request at virtual address aaaaaace
 [   17.289590] pgd = 9e71c000
 [   17.289696] [aaaaaace] *pgd=00000000
 [   17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 [   17.290169] Modules linked in:
 [   17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21
 [   17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000
 [   17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8
 [   17.290866] LR is at syscall_trace_enter+0x124/0x184

Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers.

Commit cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
added the check for less than zero, but it should have also checked
for greater than NR_syscalls.

Link: http://lkml.kernel.org/p/1414620418-29472-1-git-send-email-rabin@rab.in

Fixes: cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Martinusbe pushed a commit that referenced this pull request Oct 12, 2015
While accessing cur_policy during executing events
CPUFREQ_GOV_START, CPUFREQ_GOV_STOP, CPUFREQ_GOV_LIMITS,
same mutex lock is not taken, dbs_data->mutex, which leads
to race and data corruption while running continious suspend
resume test. This is seen with ondemand governor with suspend
resume test using rtcwake.

 Unable to handle kernel NULL pointer dereference at virtual address 00000028
 pgd = ed610000
 [00000028] *pgd=adf11831, *pte=00000000, *ppte=00000000
 Internal error: Oops: 17 [#1] PREEMPT SMP ARM
 Modules linked in: nvhost_vi
 CPU: 1 PID: 3243 Comm: rtcwake Not tainted 3.10.24-gf5cf9e5 #1
 task: ee708040 ti: ed61c000 task.ti: ed61c000
 PC is at cpufreq_governor_dbs+0x400/0x634
 LR is at cpufreq_governor_dbs+0x3f8/0x634
 pc : [<c05652b8>] lr : [<c05652b0>] psr: 600f0013
 sp : ed61dcb0 ip : 000493e0 fp : c1cc14f0
 r10: 00000000 r9 : 00000000 r8 : 00000000
 r7 : eb725280 r6 : c1cc1560 r5 : eb575200 r4 : ebad7740
 r3 : ee708040 r2 : ed61dca8 r1 : 001ebd24 r0 : 00000000
 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
 Control: 10c5387d Table: ad61006a DAC: 00000015
 [<c05652b8>] (cpufreq_governor_dbs+0x400/0x634) from [<c055f700>] (__cpufreq_governor+0x98/0x1b4)
 [<c055f700>] (__cpufreq_governor+0x98/0x1b4) from [<c0560770>] (__cpufreq_set_policy+0x250/0x320)
 [<c0560770>] (__cpufreq_set_policy+0x250/0x320) from [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168)
 [<c0561dcc>] (cpufreq_update_policy+0xcc/0x168) from [<c0561ed0>] (cpu_freq_notify+0x68/0xdc)
 [<c0561ed0>] (cpu_freq_notify+0x68/0xdc) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c)
 [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68)
 [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28)
 [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310)
 [<c00aac6c>] (pm_qos_update_bounded_target+0xd8/0x310) from [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70)
 [<c00ab3b0>] (__pm_qos_update_request+0x64/0x70) from [<c004b4b8>] (tegra_pm_notify+0x114/0x134)
 [<c004b4b8>] (tegra_pm_notify+0x114/0x134) from [<c008eff8>] (notifier_call_chain+0x4c/0x8c)
 [<c008eff8>] (notifier_call_chain+0x4c/0x8c) from [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68)
 [<c008f3d4>] (__blocking_notifier_call_chain+0x50/0x68) from [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28)
 [<c008f40c>] (blocking_notifier_call_chain+0x20/0x28) from [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34)
 [<c00ac228>] (pm_notifier_call_chain+0x1c/0x34) from [<c00ad38c>] (enter_state+0xec/0x128)
 [<c00ad38c>] (enter_state+0xec/0x128) from [<c00ad400>] (pm_suspend+0x38/0xa4)
 [<c00ad400>] (pm_suspend+0x38/0xa4) from [<c00ac114>] (state_store+0x70/0xc0)
 [<c00ac114>] (state_store+0x70/0xc0) from [<c027b1e8>] (kobj_attr_store+0x14/0x20)
 [<c027b1e8>] (kobj_attr_store+0x14/0x20) from [<c019cd9c>] (sysfs_write_file+0x104/0x184)
 [<c019cd9c>] (sysfs_write_file+0x104/0x184) from [<c0143038>] (vfs_write+0xd0/0x19c)
 [<c0143038>] (vfs_write+0xd0/0x19c) from [<c0143414>] (SyS_write+0x4c/0x78)
 [<c0143414>] (SyS_write+0x4c/0x78) from [<c000f080>] (ret_fast_syscall+0x0/0x30)
 Code: e1a00006 eb084346 e59b0020 e5951024 (e5903028)
 ---[ end trace 0488523c8f6b0f9d ]---

Signed-off-by: Bibek Basu <bbasu@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.11+ <stable@vger.kernel.org> # 3.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
Martinusbe pushed a commit that referenced this pull request Dec 27, 2015
ARM has some private syscalls (for example, set_tls(2)) which lie
outside the range of NR_syscalls.  If any of these are called while
syscall tracing is being performed, out-of-bounds array access will
occur in the ftrace and perf sys_{enter,exit} handlers.

 # trace-cmd record -e raw_syscalls:* true && trace-cmd report
 ...
 true-653   [000]   384.675777: sys_enter:            NR 192 (0, 1000, 3, 4000022, ffffffff, 0)
 true-653   [000]   384.675812: sys_exit:             NR 192 = 1995915264
 true-653   [000]   384.675971: sys_enter:            NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1)
 true-653   [000]   384.675988: sys_exit:             NR 983045 = 0
 ...

 # trace-cmd record -e syscalls:* true
 [   17.289329] Unable to handle kernel paging request at virtual address aaaaaace
 [   17.289590] pgd = 9e71c000
 [   17.289696] [aaaaaace] *pgd=00000000
 [   17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 [   17.290169] Modules linked in:
 [   17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21
 [   17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000
 [   17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8
 [   17.290866] LR is at syscall_trace_enter+0x124/0x184

Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers.

Commit cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
added the check for less than zero, but it should have also checked
for greater than NR_syscalls.

Link: http://lkml.kernel.org/p/1414620418-29472-1-git-send-email-rabin@rab.in

Fixes: cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Martinusbe pushed a commit that referenced this pull request Dec 27, 2015
The __f2fs_add_link is covered by cp_rwsem all the time.
This calls init_inode_metadata, which conducts some acl operations including
memory allocation with GFP_KERNEL previously.
But, under memory pressure, f2fs_write_data_page can be called, which also
grabs cp_rwsem too.

In this case, this incurs a deadlock pointed by Chao.
Thread #1        Thread #2
 down_read
                 down_write
  down_read
 -> here down_read should wait forever.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Martinusbe pushed a commit that referenced this pull request Dec 27, 2015
This patch doesn't make any effect on previous behavior, since
f2fs_write_data_page bypasses writing the page during POR.

But, the difference is that this patch avoids holding writepages mutex.
This is to avoid the following false warning, since this can happen only
when mount and shutdown are triggered at the same time.

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.0.0-rc1+ #3 Tainted: G           O
 -------------------------------------------------------
 kworker/u8:0/2270 is trying to acquire lock:
  (&sbi->gc_mutex){+.+.+.}, at: [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs]

 but task is already holding lock:
  (&sbi->writepages){+.+...}, at: [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs]

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (&sbi->writepages){+.+...}:
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs]
        [<ffffffff811c38c1>] do_writepages+0x21/0x50
        [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0
        [<ffffffff8126e23a>] writeback_single_inode+0xea/0x1c0
        [<ffffffff8126e425>] write_inode_now+0x95/0xa0
        [<ffffffff81259dab>] iput+0x20b/0x3f0
        [<ffffffffa02c1c8b>] recover_data.constprop.14+0x26b/0xa80 [f2fs]
        [<ffffffffa02c2776>] recover_fsync_data+0x2b6/0x5e0 [f2fs]
        [<ffffffffa02a9744>] f2fs_fill_super+0xb24/0xb90 [f2fs]
        [<ffffffff8123d7f4>] mount_bdev+0x1a4/0x1e0
        [<ffffffffa02a3c85>] f2fs_mount+0x15/0x20 [f2fs]
        [<ffffffff8123e159>] mount_fs+0x39/0x180
        [<ffffffff8125e51b>] vfs_kern_mount+0x6b/0x160
        [<ffffffff81261554>] do_mount+0x204/0xbe0
        [<ffffffff8126223b>] SyS_mount+0x8b/0xe0
        [<ffffffff81863e6d>] system_call_fastpath+0x16/0x1b

 -> #1 (&sbi->cp_mutex){+.+...}:
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02acbf2>] write_checkpoint+0x42/0x1230 [f2fs]
        [<ffffffffa02a847d>] f2fs_sync_fs+0x9d/0x2a0 [f2fs]
        [<ffffffff81272f82>] sync_filesystem+0x82/0xb0
        [<ffffffff8123c214>] generic_shutdown_super+0x34/0x100
        [<ffffffff8123c5f7>] kill_block_super+0x27/0x70
        [<ffffffffa02a3c60>] kill_f2fs_super+0x20/0x30 [f2fs]
        [<ffffffff8123ca49>] deactivate_locked_super+0x49/0x80
        [<ffffffff8123d05e>] deactivate_super+0x4e/0x70
        [<ffffffff8125df63>] cleanup_mnt+0x43/0x90
        [<ffffffff8125e002>] __cleanup_mnt+0x12/0x20
        [<ffffffff810a82e4>] task_work_run+0xc4/0xf0
        [<ffffffff8101f0bd>] do_notify_resume+0x8d/0xa0
        [<ffffffff81864141>] int_signal+0x12/0x17

 -> #0 (&sbi->gc_mutex){+.+.+.}:
        [<ffffffff810e2866>] __lock_acquire+0x1ac6/0x1c90
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs]
        [<ffffffffa02b5938>] f2fs_write_data_page+0x348/0x5b0 [f2fs]
        [<ffffffffa02af9da>] __f2fs_writepage+0x1a/0x50 [f2fs]
        [<ffffffff811c1b54>] write_cache_pages+0x274/0x6f0
        [<ffffffffa02b2630>] f2fs_write_data_pages+0xe0/0x3a0 [f2fs]
        [<ffffffff811c38c1>] do_writepages+0x21/0x50
        [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0
        [<ffffffff8126d44a>] writeback_sb_inodes+0x32a/0x710
        [<ffffffff8126d8cf>] __writeback_inodes_wb+0x9f/0xd0
        [<ffffffff8126dcdb>] wb_writeback+0x3db/0x850
        [<ffffffff8126e848>] bdi_writeback_workfn+0x148/0x980
        [<ffffffff810a3782>] process_one_work+0x1e2/0x840
        [<ffffffff810a3f01>] worker_thread+0x121/0x460
        [<ffffffff810a9dc8>] kthread+0xf8/0x110
        [<ffffffff81863dbc>] ret_from_fork+0x7c/0xb0

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Martinusbe pushed a commit that referenced this pull request Dec 27, 2015
We will encounter oops by executing below command.
getfattr -n system.advise /mnt/f2fs/file
Killed

message log:
BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs]
*pdpt = 00000000319b7001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
Modules linked in: f2fs(O) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev
snd_seq_device snd_timer bnep snd rfcomm microcode bluetooth soundcore i2c_piix4 mac_hid serio_raw parport_pc ppdev lp parport
binfmt_misc hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs]
CPU: 3 PID: 3134 Comm: getfattr Tainted: G           O    4.0.0-rc1 #6
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: f3a71b60 ti: f19a6000 task.ti: f19a6000
EIP: 0060:[<f8b54d69>] EFLAGS: 00010246 CPU: 3
EIP is at f2fs_xattr_advise_get+0x29/0x40 [f2fs]
EAX: 00000000 EBX: f19a7e71 ECX: 00000000 EDX: f8b5b467
ESI: 00000000 EDI: f2008570 EBP: f19a7e14 ESP: f19a7e08
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 00000000 CR3: 319b8000 CR4: 000007f0
Stack:
 f8b5a634 c0cbb580 00000000 f19a7e34 c1193850 00000000 00000007 f19a7e71
 f19a7e64 c0cbb580 c1193810 f19a7e50 c1193c00 00000000 00000000 00000000
 c0cbb580 00000000 f19a7f70 c1194097 00000000 00000000 00000000 74737973
Call Trace:
 [<c1193850>] generic_getxattr+0x40/0x50
 [<c1193810>] ? xattr_resolve_name+0x80/0x80
 [<c1193c00>] vfs_getxattr+0x70/0xa0
 [<c1194097>] getxattr+0x87/0x190
 [<c11801d7>] ? path_lookupat+0x57/0x5f0
 [<c11819d2>] ? putname+0x32/0x50
 [<c116653a>] ? kmem_cache_alloc+0x2a/0x130
 [<c11819d2>] ? putname+0x32/0x50
 [<c11819d2>] ? putname+0x32/0x50
 [<c11819d2>] ? putname+0x32/0x50
 [<c11827f9>] ? user_path_at_empty+0x49/0x70
 [<c118283f>] ? user_path_at+0x1f/0x30
 [<c11941e7>] path_getxattr+0x47/0x80
 [<c11948e7>] SyS_getxattr+0x27/0x30
 [<c163f748>] sysenter_do_call+0x12/0x12
Code: 66 90 55 89 e5 57 56 53 66 66 66 66 90 8b 78 20 89 d3 ba 67 b4 b5 f8 89 d8 89 ce e8 42 7c 7b c8 85 c0 75 16 0f b6 87 44 01 00
00 <88> 06 b8 01 00 00 00 5b 5e 5f 5d c3 8d 76 00 b8 ea ff ff ff eb
EIP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] SS:ESP 0068:f19a7e08
CR2: 0000000000000000
---[ end trace 860260654f1f416a ]---

The reason is that in getfattr there are two steps which is indicated by strace info:
1) try to lookup and get size of specified xattr.
2) get value of the extented attribute.

strace info:
getxattr("/mnt/f2fs/file", "system.advise", 0x0, 0) = 1
getxattr("/mnt/f2fs/file", "system.advise", "\x00", 256) = 1

For the first step, getfattr may pass a NULL pointer in @value and zero in @SiZe
as parameters for ->getxattr, but we access this @value pointer directly without
checking whether the pointer is valid or not in f2fs_xattr_advise_get, so the
oops occurs.

This patch fixes this issue by verifying @value pointer before using.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Martinusbe pushed a commit that referenced this pull request Dec 27, 2015
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty #76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(cherry picked from commit 32a71bc)
Git-commit: 41ba062
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Martinusbe pushed a commit that referenced this pull request Jan 24, 2016
ARM has some private syscalls (for example, set_tls(2)) which lie
outside the range of NR_syscalls.  If any of these are called while
syscall tracing is being performed, out-of-bounds array access will
occur in the ftrace and perf sys_{enter,exit} handlers.

 # trace-cmd record -e raw_syscalls:* true && trace-cmd report
 ...
 true-653   [000]   384.675777: sys_enter:            NR 192 (0, 1000, 3, 4000022, ffffffff, 0)
 true-653   [000]   384.675812: sys_exit:             NR 192 = 1995915264
 true-653   [000]   384.675971: sys_enter:            NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1)
 true-653   [000]   384.675988: sys_exit:             NR 983045 = 0
 ...

 # trace-cmd record -e syscalls:* true
 [   17.289329] Unable to handle kernel paging request at virtual address aaaaaace
 [   17.289590] pgd = 9e71c000
 [   17.289696] [aaaaaace] *pgd=00000000
 [   17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 [   17.290169] Modules linked in:
 [   17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21
 [   17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000
 [   17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8
 [   17.290866] LR is at syscall_trace_enter+0x124/0x184

Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers.

Commit cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
added the check for less than zero, but it should have also checked
for greater than NR_syscalls.

Link: http://lkml.kernel.org/p/1414620418-29472-1-git-send-email-rabin@rab.in

Fixes: cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Martinusbe pushed a commit that referenced this pull request Jan 24, 2016
If a user key gets negatively instantiated, an error code is cached in the
payload area.  A negatively instantiated key may be then be positively
instantiated by updating it with valid data.  However, the ->update key
type method must be aware that the error code may be there.

The following may be used to trigger the bug in the user key type:

    keyctl request2 user user "" @U
    keyctl add user user "a" @U

which manifests itself as:

	BUG: unable to handle kernel paging request at 00000000ffffff8a
	IP: [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
	PGD 7cc30067 PUD 0
	Oops: 0002 [#1] SMP
	Modules linked in:
	CPU: 3 PID: 2644 Comm: a.out Not tainted 4.3.0+ #49
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
	task: ffff88003ddea700 ti: ffff88003dd88000 task.ti: ffff88003dd88000
	RIP: 0010:[<ffffffff810a376f>]  [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280
	 [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
	RSP: 0018:ffff88003dd8bdb0  EFLAGS: 00010246
	RAX: 00000000ffffff82 RBX: 0000000000000000 RCX: 0000000000000001
	RDX: ffffffff81e3fe40 RSI: 0000000000000000 RDI: 00000000ffffff82
	RBP: ffff88003dd8bde0 R08: ffff88007d2d2da0 R09: 0000000000000000
	R10: 0000000000000000 R11: ffff88003e8073c0 R12: 00000000ffffff82
	R13: ffff88003dd8be68 R14: ffff88007d027600 R15: ffff88003ddea700
	FS:  0000000000b92880(0063) GS:ffff88007fd00000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
	CR2: 00000000ffffff8a CR3: 000000007cc5f000 CR4: 00000000000006e0
	Stack:
	 ffff88003dd8bdf0 ffffffff81160a8a 0000000000000000 00000000ffffff82
	 ffff88003dd8be68 ffff88007d027600 ffff88003dd8bdf0 ffffffff810a39e5
	 ffff88003dd8be20 ffffffff812a31ab ffff88007d027600 ffff88007d027620
	Call Trace:
	 [<ffffffff810a39e5>] kfree_call_rcu+0x15/0x20 kernel/rcu/tree.c:3136
	 [<ffffffff812a31ab>] user_update+0x8b/0xb0 security/keys/user_defined.c:129
	 [<     inline     >] __key_update security/keys/key.c:730
	 [<ffffffff8129e5c1>] key_create_or_update+0x291/0x440 security/keys/key.c:908
	 [<     inline     >] SYSC_add_key security/keys/keyctl.c:125
	 [<ffffffff8129fc21>] SyS_add_key+0x101/0x1e0 security/keys/keyctl.c:60
	 [<ffffffff8185f617>] entry_SYSCALL_64_fastpath+0x12/0x6a arch/x86/entry/entry_64.S:185

Note the error code (-ENOKEY) in EDX.

A similar bug can be tripped by:

    keyctl request2 trusted user "" @U
    keyctl add trusted user "a" @U

This should also affect encrypted keys - but that has to be correctly
parameterised or it will fail with EINVAL before getting to the bit that
will crashes.

Change-Id: I6452091ee01e324c934cf0c7d5a0f3ffa5d0b001
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Martinusbe pushed a commit that referenced this pull request Feb 1, 2016
commit b0dc3a3 upstream.

Qiu Xishi reported the following BUG when testing hot-add/hot-remove node under
stress condition:

  BUG: unable to handle kernel paging request at 0000000000025f60
  IP: next_online_pgdat+0x1/0x50
  PGD 0
  Oops: 0000 [#1] SMP
  ACPI: Device does not support D3cold
  Modules linked in: fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod coretemp mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr microcode igb dca i2c_algo_bit ipv6 megaraid_sas iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support tg3 sg hwmon ptp lpc_ich pps_core mfd_core acpi_pad rtc_cmos button ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh ahci libahci libata scsi_mod [last unloaded: rasf]
  CPU: 23 PID: 238 Comm: kworker/23:1 Tainted: G           O 3.10.15-5885-euler0302 #1
  Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei N1, BIOS V100R001 03/02/2015
  Workqueue: events vmstat_update
  task: ffffa800d32c0000 ti: ffffa800d32ae000 task.ti: ffffa800d32ae000
  RIP: 0010: next_online_pgdat+0x1/0x50
  RSP: 0018:ffffa800d32afce8  EFLAGS: 00010286
  RAX: 0000000000001440 RBX: ffffffff81da53b8 RCX: 0000000000000082
  RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000000
  RBP: ffffa800d32afd28 R08: ffffffff81c93bfc R09: ffffffff81cbdc96
  R10: 00000000000040ec R11: 00000000000000a0 R12: ffffa800fffb3440
  R13: ffffa800d32afd38 R14: 0000000000000017 R15: ffffa800e6616800
  FS:  0000000000000000(0000) GS:ffffa800e6600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000025f60 CR3: 0000000001a0b000 CR4: 00000000001407e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
    refresh_cpu_vm_stats+0xd0/0x140
    vmstat_update+0x11/0x50
    process_one_work+0x194/0x3d0
    worker_thread+0x12b/0x410
    kthread+0xc6/0xd0
    ret_from_fork+0x7c/0xb0

The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of
try_offline_node, which will reset all the content of pgdat to 0, as the
pgdat is accessed lock-free, so that the users still using the pgdat
will panic, such as the vmstat_update routine.

process A:				offline node XX:

vmstat_updat()
   refresh_cpu_vm_stats()
     for_each_populated_zone()
       find online node XX
     cond_resched()
					offline cpu and memory, then try_offline_node()
					node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
       zone = next_zone(zone)
         pg_data_t *pgdat = zone->zone_pgdat;  // here pgdat is NULL now
           next_online_pgdat(pgdat)
             next_online_node(pgdat->node_id);  // NULL pointer access

So the solution here is postponing the reset of obsolete pgdat from
try_offline_node() to hotadd_new_pgdat(), and just resetting
pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset
0 to avoid breaking pointer information in pgdat.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Xishi Qiu <qiuxishi@huawei.com>
Suggested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 1, 2016
commit 85bd839 upstream.

Izumi found the following oops when hot re-adding a node:

    BUG: unable to handle kernel paging request at ffffc90008963690
    IP: __wake_up_bit+0x20/0x70
    Oops: 0000 [#1] SMP
    CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
    Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
    task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
    RIP: 0010:[<ffffffff810dff80>]  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
    RSP: 0018:ffff880017b97be8  EFLAGS: 00010246
    RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
    RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
    RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
    R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
    FS:  00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
    Call Trace:
      unlock_page+0x6d/0x70
      generic_write_end+0x53/0xb0
      xfs_vm_write_end+0x29/0x80 [xfs]
      generic_perform_write+0x10a/0x1e0
      xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
      xfs_file_write_iter+0x79/0x120 [xfs]
      __vfs_write+0xd4/0x110
      vfs_write+0xac/0x1c0
      SyS_write+0x58/0xd0
      system_call_fastpath+0x12/0x76
    Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
    RIP  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
     RSP <ffff880017b97be8>
    CR2: ffffc90008963690

Reproduce method (re-add a node)::
  Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)

This seems an use-after-free problem, and the root cause is
zone->wait_table was not set to *NULL* after free it in
try_offline_node.

When hot re-add a node, we will reuse the pgdat of it, so does the zone
struct, and when add pages to the target zone, it will init the zone
first (including the wait_table) if the zone is not initialized.  The
judgement of zone initialized is based on zone->wait_table:

	static inline bool zone_is_initialized(struct zone *zone)
	{
		return !!zone->wait_table;
	}

so if we do not set the zone->wait_table to *NULL* after free it, the
memory hotplug routine will skip the init of new zone when hot re-add
the node, and the wait_table still points to the freed memory, then we
will access the invalid address when trying to wake up the waiting
people after the i/o operation with the page is done, such as mentioned
above.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 2, 2016
commit 045c47c upstream.

When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.

Like other functions in the throttle code, just checking for a NULL
stats_cpu prevents the following oops caused by that race.

[ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
[ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
[ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
[ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
[ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
[ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
[ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
[ 1137.734202] REGS: c000000f1d213500 TRAP: 0300   Not tainted  (3.19.0)
[ 1137.734230] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 42008884  XER: 20000000
[ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
[ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
[ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
[ 1137.734943] Call Trace:
[ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
[ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
[ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
[ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
[ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
[ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
[ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
[ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
[ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
[ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
[ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
[ 1137.735383] Instruction dump:
[ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
[ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 <7cead02a> e9090008 e9490010 e9290018

And here is one code that allows to easily reproduce this, although this
has first been found by running docker.

void run(pid_t pid)
{
	int n;
	int status;
	int fd;
	char *buffer;
	buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
	n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
	fd = open(CGPATH "/test/tasks", O_WRONLY);
	write(fd, buffer, n);
	close(fd);
	if (fork() > 0) {
		fd = open("/dev/sda", O_RDONLY | O_DIRECT);
		read(fd, buffer, 512);
		close(fd);
		wait(&status);
	} else {
		fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
		n = read(fd, buffer, BUFFER_SIZE);
		close(fd);
	}
	free(buffer);
	exit(0);
}

void test(void)
{
	int status;
	mkdir(CGPATH "/test", 0666);
	if (fork() > 0)
		wait(&status);
	else
		run(getpid());
	rmdir(CGPATH "/test");
}

int main(int argc, char **argv)
{
	int i;
	for (i = 0; i < NR_TESTS; i++)
		test();
	return 0;
}

Reported-by: Ricardo Marin Matinata <rmm@br.ibm.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 8, 2016
commit 903124f upstream.

memset() to 0 interfaces array before reusing
usb_configuration structure.

This commit fix bug:

ln -s functions/acm.1 configs/c.1
ln -s functions/acm.2 configs/c.1
ln -s functions/acm.3 configs/c.1
echo "UDC name" > UDC
echo "" > UDC
rm configs/c.1/acm.*
rmdir functions/*
mkdir functions/ecm.usb0
ln -s functions/ecm.usb0 configs/c.1
echo "UDC name" > UDC

[   82.220969] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   82.229009] pgd = c0004000
[   82.231698] [00000000] *pgd=00000000
[   82.235260] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[   82.240638] Modules linked in:
[   82.243681] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0-rc2 #39
[   82.249926] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[   82.256003] task: c07cd2f0 ti: c07c8000 task.ti: c07c8000
[   82.261393] PC is at composite_setup+0xe3c/0x1674
[   82.266073] LR is at composite_setup+0xf20/0x1674
[   82.270760] pc : [<c03510d4>]    lr : [<c03511b8>]    psr: 600001d3
[   82.270760] sp : c07c9df0  ip : c0806448  fp : ed8c9c9c
[   82.282216] r10: 00000001  r9 : 00000000  r8 : edaae918
[   82.287425] r7 : ed551cc0  r6 : 00007fff  r5 : 00000000  r4 : ed799634
[   82.293934] r3 : 00000003  r2 : 00010002  r1 : edaae918  r0 : 0000002e
[   82.300446] Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
[   82.307910] Control: 10c5387d  Table: 6bc1804a  DAC: 00000015
[   82.313638] Process swapper/0 (pid: 0, stack limit = 0xc07c8210)
[   82.319627] Stack: (0xc07c9df0 to 0xc07ca000)
[   82.323969] 9de0:                                     00000000 c06e65f4 00000000 c07c9f68
[   82.332130] 9e00: 00000067 c07c59ac 000003f7 edaae918 ed8c9c98 ed799690 eca2f140 200001d3
[   82.340289] 9e20: ee79a2d8 c07c9e88 c07c5304 ffff55db 00010002 edaae810 edaae860 eda96d50
[   82.348448] 9e40: 00000009 ee264510 00000007 c07ca444 edaae860 c0340890 c0827a40 ffff55e0
[   82.356607] 9e60: c0827a40 eda96e40 ee264510 edaae810 00000000 edaae860 00000007 c07ca444
[   82.364766] 9e80: edaae860 c0354170 c03407dc c033db4c edaae810 00000000 00000000 00000010
[   82.372925] 9ea0: 00000032 c0341670 00000000 00000000 00000001 eda96e00 00000000 00000000
[   82.381084] 9ec0: 00000000 00000032 c0803a23 ee1aa840 00000001 c005d54c 249e2450 00000000
[   82.389244] 9ee0: 200001d3 ee1aa840 ee1aa8a0 ed84f4c0 00000000 c07c9f68 00000067 c07c59ac
[   82.397403] 9f00: 00000000 c005d688 ee1aa840 ee1aa8a0 c07db4b4 c006009c 00000032 00000000
[   82.405562] 9f20: 00000001 c005ce20 c07c59ac c005cf34 f002000c c07ca780 c07c9f68 00000057
[   82.413722] 9f40: f0020000 413fc090 00000001 c00086b4 c000f804 60000053 ffffffff c07c9f9c
[   82.421880] 9f60: c0803a20 c0011fc0 00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.430040] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.438199] 9fa0: c000f800 c000f804 60000053 ffffffff 00000000 c0050e70 c0803bc0 c0783bd8
[   82.446358] 9fc0: ffffffff ffffffff c0783664 00000000 00000000 c07b13e8 00000000 c0803e54
[   82.454517] 9fe0: c07ca480 c07b13e4 c07ce40c 4000406a 00000000 40008074 00000000 00000000
[   82.462689] [<c03510d4>] (composite_setup) from [<c0340890>] (s3c_hsotg_complete_setup+0xb4/0x418)
[   82.471626] [<c0340890>] (s3c_hsotg_complete_setup) from [<c0354170>] (usb_gadget_giveback_request+0xc/0x10)
[   82.481429] [<c0354170>] (usb_gadget_giveback_request) from [<c033db4c>] (s3c_hsotg_complete_request+0xcc/0x12c)
[   82.491583] [<c033db4c>] (s3c_hsotg_complete_request) from [<c0341670>] (s3c_hsotg_irq+0x4fc/0x558)
[   82.500614] [<c0341670>] (s3c_hsotg_irq) from [<c005d54c>] (handle_irq_event_percpu+0x50/0x150)
[   82.509291] [<c005d54c>] (handle_irq_event_percpu) from [<c005d688>] (handle_irq_event+0x3c/0x5c)
[   82.518145] [<c005d688>] (handle_irq_event) from [<c006009c>] (handle_fasteoi_irq+0xd4/0x18c)
[   82.526650] [<c006009c>] (handle_fasteoi_irq) from [<c005ce20>] (generic_handle_irq+0x20/0x30)
[   82.535242] [<c005ce20>] (generic_handle_irq) from [<c005cf34>] (__handle_domain_irq+0x6c/0xdc)
[   82.543923] [<c005cf34>] (__handle_domain_irq) from [<c00086b4>] (gic_handle_irq+0x2c/0x6c)
[   82.552256] [<c00086b4>] (gic_handle_irq) from [<c0011fc0>] (__irq_svc+0x40/0x74)
[   82.559716] Exception stack(0xc07c9f68 to 0xc07c9fb0)
[   82.564753] 9f60:                   00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.572913] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.581069] 9fa0: c000f800 c000f804 60000053 ffffffff
[   82.586113] [<c0011fc0>] (__irq_svc) from [<c000f804>] (arch_cpu_idle+0x30/0x3c)
[   82.593491] [<c000f804>] (arch_cpu_idle) from [<c0050e70>] (cpu_startup_entry+0x128/0x1a4)
[   82.601740] [<c0050e70>] (cpu_startup_entry) from [<c0783bd8>] (start_kernel+0x350/0x3bc)
[   82.609890] Code: 0a000002 e3530005 05975010 15975008 (e5953000)
[   82.615965] ---[ end trace f57d5f599a5f1bfa ]---

Most of kernel code assume that interface array in
struct usb_configuration is NULL terminated.

When gadget is composed with configfs configuration
structure may be reused for different functions set.

This bug happens because purge_configs_funcs() sets
only next_interface_id to 0. Interface array still
contains pointers to already freed interfaces. If in
second try we add less interfaces than earlier we
may access unallocated memory when trying to get
interface descriptors.

Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 8, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty #76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(cherry picked from commit 32a71bc)
Git-commit: 41ba062
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Martinusbe pushed a commit that referenced this pull request Feb 12, 2016
commit b0dc3a3 upstream.

Qiu Xishi reported the following BUG when testing hot-add/hot-remove node under
stress condition:

  BUG: unable to handle kernel paging request at 0000000000025f60
  IP: next_online_pgdat+0x1/0x50
  PGD 0
  Oops: 0000 [#1] SMP
  ACPI: Device does not support D3cold
  Modules linked in: fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod coretemp mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr microcode igb dca i2c_algo_bit ipv6 megaraid_sas iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support tg3 sg hwmon ptp lpc_ich pps_core mfd_core acpi_pad rtc_cmos button ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh ahci libahci libata scsi_mod [last unloaded: rasf]
  CPU: 23 PID: 238 Comm: kworker/23:1 Tainted: G           O 3.10.15-5885-euler0302 #1
  Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei N1, BIOS V100R001 03/02/2015
  Workqueue: events vmstat_update
  task: ffffa800d32c0000 ti: ffffa800d32ae000 task.ti: ffffa800d32ae000
  RIP: 0010: next_online_pgdat+0x1/0x50
  RSP: 0018:ffffa800d32afce8  EFLAGS: 00010286
  RAX: 0000000000001440 RBX: ffffffff81da53b8 RCX: 0000000000000082
  RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000000
  RBP: ffffa800d32afd28 R08: ffffffff81c93bfc R09: ffffffff81cbdc96
  R10: 00000000000040ec R11: 00000000000000a0 R12: ffffa800fffb3440
  R13: ffffa800d32afd38 R14: 0000000000000017 R15: ffffa800e6616800
  FS:  0000000000000000(0000) GS:ffffa800e6600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000025f60 CR3: 0000000001a0b000 CR4: 00000000001407e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
    refresh_cpu_vm_stats+0xd0/0x140
    vmstat_update+0x11/0x50
    process_one_work+0x194/0x3d0
    worker_thread+0x12b/0x410
    kthread+0xc6/0xd0
    ret_from_fork+0x7c/0xb0

The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of
try_offline_node, which will reset all the content of pgdat to 0, as the
pgdat is accessed lock-free, so that the users still using the pgdat
will panic, such as the vmstat_update routine.

process A:				offline node XX:

vmstat_updat()
   refresh_cpu_vm_stats()
     for_each_populated_zone()
       find online node XX
     cond_resched()
					offline cpu and memory, then try_offline_node()
					node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
       zone = next_zone(zone)
         pg_data_t *pgdat = zone->zone_pgdat;  // here pgdat is NULL now
           next_online_pgdat(pgdat)
             next_online_node(pgdat->node_id);  // NULL pointer access

So the solution here is postponing the reset of obsolete pgdat from
try_offline_node() to hotadd_new_pgdat(), and just resetting
pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset
0 to avoid breaking pointer information in pgdat.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Xishi Qiu <qiuxishi@huawei.com>
Suggested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 12, 2016
commit 85bd839 upstream.

Izumi found the following oops when hot re-adding a node:

    BUG: unable to handle kernel paging request at ffffc90008963690
    IP: __wake_up_bit+0x20/0x70
    Oops: 0000 [#1] SMP
    CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
    Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
    task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
    RIP: 0010:[<ffffffff810dff80>]  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
    RSP: 0018:ffff880017b97be8  EFLAGS: 00010246
    RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
    RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
    RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
    R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
    FS:  00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
    Call Trace:
      unlock_page+0x6d/0x70
      generic_write_end+0x53/0xb0
      xfs_vm_write_end+0x29/0x80 [xfs]
      generic_perform_write+0x10a/0x1e0
      xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
      xfs_file_write_iter+0x79/0x120 [xfs]
      __vfs_write+0xd4/0x110
      vfs_write+0xac/0x1c0
      SyS_write+0x58/0xd0
      system_call_fastpath+0x12/0x76
    Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
    RIP  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
     RSP <ffff880017b97be8>
    CR2: ffffc90008963690

Reproduce method (re-add a node)::
  Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)

This seems an use-after-free problem, and the root cause is
zone->wait_table was not set to *NULL* after free it in
try_offline_node.

When hot re-add a node, we will reuse the pgdat of it, so does the zone
struct, and when add pages to the target zone, it will init the zone
first (including the wait_table) if the zone is not initialized.  The
judgement of zone initialized is based on zone->wait_table:

	static inline bool zone_is_initialized(struct zone *zone)
	{
		return !!zone->wait_table;
	}

so if we do not set the zone->wait_table to *NULL* after free it, the
memory hotplug routine will skip the init of new zone when hot re-add
the node, and the wait_table still points to the freed memory, then we
will access the invalid address when trying to wake up the waiting
people after the i/o operation with the page is done, such as mentioned
above.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 12, 2016
commit 045c47c upstream.

When reading blkio.throttle.io_serviced in a recently created blkio
cgroup, it's possible to race against the creation of a throttle policy,
which delays the allocation of stats_cpu.

Like other functions in the throttle code, just checking for a NULL
stats_cpu prevents the following oops caused by that race.

[ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
[ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
[ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
[ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
[ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
[ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
[ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
[ 1137.734202] REGS: c000000f1d213500 TRAP: 0300   Not tainted  (3.19.0)
[ 1137.734230] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 42008884  XER: 20000000
[ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
[ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
[ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
[ 1137.734943] Call Trace:
[ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
[ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
[ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
[ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
[ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
[ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
[ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
[ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
[ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
[ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
[ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
[ 1137.735383] Instruction dump:
[ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
[ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 <7cead02a> e9090008 e9490010 e9290018

And here is one code that allows to easily reproduce this, although this
has first been found by running docker.

void run(pid_t pid)
{
	int n;
	int status;
	int fd;
	char *buffer;
	buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
	n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
	fd = open(CGPATH "/test/tasks", O_WRONLY);
	write(fd, buffer, n);
	close(fd);
	if (fork() > 0) {
		fd = open("/dev/sda", O_RDONLY | O_DIRECT);
		read(fd, buffer, 512);
		close(fd);
		wait(&status);
	} else {
		fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
		n = read(fd, buffer, BUFFER_SIZE);
		close(fd);
	}
	free(buffer);
	exit(0);
}

void test(void)
{
	int status;
	mkdir(CGPATH "/test", 0666);
	if (fork() > 0)
		wait(&status);
	else
		run(getpid());
	rmdir(CGPATH "/test");
}

int main(int argc, char **argv)
{
	int i;
	for (i = 0; i < NR_TESTS; i++)
		test();
	return 0;
}

Reported-by: Ricardo Marin Matinata <rmm@br.ibm.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 12, 2016
commit 903124f upstream.

memset() to 0 interfaces array before reusing
usb_configuration structure.

This commit fix bug:

ln -s functions/acm.1 configs/c.1
ln -s functions/acm.2 configs/c.1
ln -s functions/acm.3 configs/c.1
echo "UDC name" > UDC
echo "" > UDC
rm configs/c.1/acm.*
rmdir functions/*
mkdir functions/ecm.usb0
ln -s functions/ecm.usb0 configs/c.1
echo "UDC name" > UDC

[   82.220969] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   82.229009] pgd = c0004000
[   82.231698] [00000000] *pgd=00000000
[   82.235260] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[   82.240638] Modules linked in:
[   82.243681] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0-rc2 #39
[   82.249926] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[   82.256003] task: c07cd2f0 ti: c07c8000 task.ti: c07c8000
[   82.261393] PC is at composite_setup+0xe3c/0x1674
[   82.266073] LR is at composite_setup+0xf20/0x1674
[   82.270760] pc : [<c03510d4>]    lr : [<c03511b8>]    psr: 600001d3
[   82.270760] sp : c07c9df0  ip : c0806448  fp : ed8c9c9c
[   82.282216] r10: 00000001  r9 : 00000000  r8 : edaae918
[   82.287425] r7 : ed551cc0  r6 : 00007fff  r5 : 00000000  r4 : ed799634
[   82.293934] r3 : 00000003  r2 : 00010002  r1 : edaae918  r0 : 0000002e
[   82.300446] Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
[   82.307910] Control: 10c5387d  Table: 6bc1804a  DAC: 00000015
[   82.313638] Process swapper/0 (pid: 0, stack limit = 0xc07c8210)
[   82.319627] Stack: (0xc07c9df0 to 0xc07ca000)
[   82.323969] 9de0:                                     00000000 c06e65f4 00000000 c07c9f68
[   82.332130] 9e00: 00000067 c07c59ac 000003f7 edaae918 ed8c9c98 ed799690 eca2f140 200001d3
[   82.340289] 9e20: ee79a2d8 c07c9e88 c07c5304 ffff55db 00010002 edaae810 edaae860 eda96d50
[   82.348448] 9e40: 00000009 ee264510 00000007 c07ca444 edaae860 c0340890 c0827a40 ffff55e0
[   82.356607] 9e60: c0827a40 eda96e40 ee264510 edaae810 00000000 edaae860 00000007 c07ca444
[   82.364766] 9e80: edaae860 c0354170 c03407dc c033db4c edaae810 00000000 00000000 00000010
[   82.372925] 9ea0: 00000032 c0341670 00000000 00000000 00000001 eda96e00 00000000 00000000
[   82.381084] 9ec0: 00000000 00000032 c0803a23 ee1aa840 00000001 c005d54c 249e2450 00000000
[   82.389244] 9ee0: 200001d3 ee1aa840 ee1aa8a0 ed84f4c0 00000000 c07c9f68 00000067 c07c59ac
[   82.397403] 9f00: 00000000 c005d688 ee1aa840 ee1aa8a0 c07db4b4 c006009c 00000032 00000000
[   82.405562] 9f20: 00000001 c005ce20 c07c59ac c005cf34 f002000c c07ca780 c07c9f68 00000057
[   82.413722] 9f40: f0020000 413fc090 00000001 c00086b4 c000f804 60000053 ffffffff c07c9f9c
[   82.421880] 9f60: c0803a20 c0011fc0 00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.430040] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.438199] 9fa0: c000f800 c000f804 60000053 ffffffff 00000000 c0050e70 c0803bc0 c0783bd8
[   82.446358] 9fc0: ffffffff ffffffff c0783664 00000000 00000000 c07b13e8 00000000 c0803e54
[   82.454517] 9fe0: c07ca480 c07b13e4 c07ce40c 4000406a 00000000 40008074 00000000 00000000
[   82.462689] [<c03510d4>] (composite_setup) from [<c0340890>] (s3c_hsotg_complete_setup+0xb4/0x418)
[   82.471626] [<c0340890>] (s3c_hsotg_complete_setup) from [<c0354170>] (usb_gadget_giveback_request+0xc/0x10)
[   82.481429] [<c0354170>] (usb_gadget_giveback_request) from [<c033db4c>] (s3c_hsotg_complete_request+0xcc/0x12c)
[   82.491583] [<c033db4c>] (s3c_hsotg_complete_request) from [<c0341670>] (s3c_hsotg_irq+0x4fc/0x558)
[   82.500614] [<c0341670>] (s3c_hsotg_irq) from [<c005d54c>] (handle_irq_event_percpu+0x50/0x150)
[   82.509291] [<c005d54c>] (handle_irq_event_percpu) from [<c005d688>] (handle_irq_event+0x3c/0x5c)
[   82.518145] [<c005d688>] (handle_irq_event) from [<c006009c>] (handle_fasteoi_irq+0xd4/0x18c)
[   82.526650] [<c006009c>] (handle_fasteoi_irq) from [<c005ce20>] (generic_handle_irq+0x20/0x30)
[   82.535242] [<c005ce20>] (generic_handle_irq) from [<c005cf34>] (__handle_domain_irq+0x6c/0xdc)
[   82.543923] [<c005cf34>] (__handle_domain_irq) from [<c00086b4>] (gic_handle_irq+0x2c/0x6c)
[   82.552256] [<c00086b4>] (gic_handle_irq) from [<c0011fc0>] (__irq_svc+0x40/0x74)
[   82.559716] Exception stack(0xc07c9f68 to 0xc07c9fb0)
[   82.564753] 9f60:                   00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.572913] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.581069] 9fa0: c000f800 c000f804 60000053 ffffffff
[   82.586113] [<c0011fc0>] (__irq_svc) from [<c000f804>] (arch_cpu_idle+0x30/0x3c)
[   82.593491] [<c000f804>] (arch_cpu_idle) from [<c0050e70>] (cpu_startup_entry+0x128/0x1a4)
[   82.601740] [<c0050e70>] (cpu_startup_entry) from [<c0783bd8>] (start_kernel+0x350/0x3bc)
[   82.609890] Code: 0a000002 e3530005 05975010 15975008 (e5953000)
[   82.615965] ---[ end trace f57d5f599a5f1bfa ]---

Most of kernel code assume that interface array in
struct usb_configuration is NULL terminated.

When gadget is composed with configfs configuration
structure may be reused for different functions set.

This bug happens because purge_configs_funcs() sets
only next_interface_id to 0. Interface array still
contains pointers to already freed interfaces. If in
second try we add less interfaces than earlier we
may access unallocated memory when trying to get
interface descriptors.

Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 12, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty #76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(cherry picked from commit 32a71bc)
Git-commit: 41ba062
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Martinusbe pushed a commit that referenced this pull request Feb 12, 2016
commit e4a60d1 upstream.

There is a race condition when removing glue directory.
It can be reproduced in following test:

path 1: Add first child device
device_add()
    get_device_parent()
            /*find parent from glue_dirs.list*/
            list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry)
                    if (k->parent == parent_kobj) {
                            kobj = kobject_get(k);
                            break;
                    }
            ....
            class_dir_create_and_add()

path2: Remove last child device under glue dir
device_del()
    cleanup_device_parent()
            cleanup_glue_dir()
                    kobject_put(glue_dir);

If path2 has been called cleanup_glue_dir(), but not
call kobject_put(glue_dir), the glue dir is still
in parent's kset list. Meanwhile, path1 find the glue
dir from the glue_dirs.list. Path2 may release glue dir
before path1 call kobject_get(). So kernel will report
the warning and bug_on.

This is a "classic" problem we have of a kref in a list
that can be found while the last instance could be removed
at the same time.

This patch reuse gdp_mutex to fix this race condition.

The following calltrace is captured in kernel 3.4, but
the latest kernel still has this bug.

-----------------------------------------------------
<4>[ 3965.441471] WARNING: at ...include/linux/kref.h:41 kobject_get+0x33/0x40()
<4>[ 3965.441474] Hardware name: Romley
<4>[ 3965.441475] Modules linked in: isd_iop(O) isd_xda(O)...
...
<4>[ 3965.441605] Call Trace:
<4>[ 3965.441611]  [<ffffffff8103717a>] warn_slowpath_common+0x7a/0xb0
<4>[ 3965.441615]  [<ffffffff810371c5>] warn_slowpath_null+0x15/0x20
<4>[ 3965.441618]  [<ffffffff81215963>] kobject_get+0x33/0x40
<4>[ 3965.441624]  [<ffffffff812d1e45>] get_device_parent.isra.11+0x135/0x1f0
<4>[ 3965.441627]  [<ffffffff812d22d4>] device_add+0xd4/0x6d0
<4>[ 3965.441631]  [<ffffffff812d0dbc>] ? dev_set_name+0x3c/0x40
....
<2>[ 3965.441912] kernel BUG at ..../fs/sysfs/group.c:65!
<4>[ 3965.441915] invalid opcode: 0000 [#1] SMP
...
<4>[ 3965.686743]  [<ffffffff811a677e>] sysfs_create_group+0xe/0x10
<4>[ 3965.686748]  [<ffffffff810cfb04>] blk_trace_init_sysfs+0x14/0x20
<4>[ 3965.686753]  [<ffffffff811fcabb>] blk_register_queue+0x3b/0x120
<4>[ 3965.686756]  [<ffffffff812030bc>] add_disk+0x1cc/0x490
....
-------------------------------------------------------

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 23, 2016
ARM has some private syscalls (for example, set_tls(2)) which lie
outside the range of NR_syscalls.  If any of these are called while
syscall tracing is being performed, out-of-bounds array access will
occur in the ftrace and perf sys_{enter,exit} handlers.

 # trace-cmd record -e raw_syscalls:* true && trace-cmd report
 ...
 true-653   [000]   384.675777: sys_enter:            NR 192 (0, 1000, 3, 4000022, ffffffff, 0)
 true-653   [000]   384.675812: sys_exit:             NR 192 = 1995915264
 true-653   [000]   384.675971: sys_enter:            NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1)
 true-653   [000]   384.675988: sys_exit:             NR 983045 = 0
 ...

 # trace-cmd record -e syscalls:* true
 [   17.289329] Unable to handle kernel paging request at virtual address aaaaaace
 [   17.289590] pgd = 9e71c000
 [   17.289696] [aaaaaace] *pgd=00000000
 [   17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 [   17.290169] Modules linked in:
 [   17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21
 [   17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000
 [   17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8
 [   17.290866] LR is at syscall_trace_enter+0x124/0x184

Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers.

Commit cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
added the check for less than zero, but it should have also checked
for greater than NR_syscalls.

Link: http://lkml.kernel.org/p/1414620418-29472-1-git-send-email-rabin@rab.in

Fixes: cd0980f "tracing: Check invalid syscall nr while tracing syscalls"
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Martinusbe pushed a commit that referenced this pull request Feb 23, 2016
The __f2fs_add_link is covered by cp_rwsem all the time.
This calls init_inode_metadata, which conducts some acl operations including
memory allocation with GFP_KERNEL previously.
But, under memory pressure, f2fs_write_data_page can be called, which also
grabs cp_rwsem too.

In this case, this incurs a deadlock pointed by Chao.
Thread #1        Thread #2
 down_read
                 down_write
  down_read
 -> here down_read should wait forever.

Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Martinusbe pushed a commit that referenced this pull request Feb 23, 2016
This patch doesn't make any effect on previous behavior, since
f2fs_write_data_page bypasses writing the page during POR.

But, the difference is that this patch avoids holding writepages mutex.
This is to avoid the following false warning, since this can happen only
when mount and shutdown are triggered at the same time.

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.0.0-rc1+ #3 Tainted: G           O
 -------------------------------------------------------
 kworker/u8:0/2270 is trying to acquire lock:
  (&sbi->gc_mutex){+.+.+.}, at: [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs]

 but task is already holding lock:
  (&sbi->writepages){+.+...}, at: [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs]

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (&sbi->writepages){+.+...}:
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02b261b>] f2fs_write_data_pages+0xcb/0x3a0 [f2fs]
        [<ffffffff811c38c1>] do_writepages+0x21/0x50
        [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0
        [<ffffffff8126e23a>] writeback_single_inode+0xea/0x1c0
        [<ffffffff8126e425>] write_inode_now+0x95/0xa0
        [<ffffffff81259dab>] iput+0x20b/0x3f0
        [<ffffffffa02c1c8b>] recover_data.constprop.14+0x26b/0xa80 [f2fs]
        [<ffffffffa02c2776>] recover_fsync_data+0x2b6/0x5e0 [f2fs]
        [<ffffffffa02a9744>] f2fs_fill_super+0xb24/0xb90 [f2fs]
        [<ffffffff8123d7f4>] mount_bdev+0x1a4/0x1e0
        [<ffffffffa02a3c85>] f2fs_mount+0x15/0x20 [f2fs]
        [<ffffffff8123e159>] mount_fs+0x39/0x180
        [<ffffffff8125e51b>] vfs_kern_mount+0x6b/0x160
        [<ffffffff81261554>] do_mount+0x204/0xbe0
        [<ffffffff8126223b>] SyS_mount+0x8b/0xe0
        [<ffffffff81863e6d>] system_call_fastpath+0x16/0x1b

 -> #1 (&sbi->cp_mutex){+.+...}:
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02acbf2>] write_checkpoint+0x42/0x1230 [f2fs]
        [<ffffffffa02a847d>] f2fs_sync_fs+0x9d/0x2a0 [f2fs]
        [<ffffffff81272f82>] sync_filesystem+0x82/0xb0
        [<ffffffff8123c214>] generic_shutdown_super+0x34/0x100
        [<ffffffff8123c5f7>] kill_block_super+0x27/0x70
        [<ffffffffa02a3c60>] kill_f2fs_super+0x20/0x30 [f2fs]
        [<ffffffff8123ca49>] deactivate_locked_super+0x49/0x80
        [<ffffffff8123d05e>] deactivate_super+0x4e/0x70
        [<ffffffff8125df63>] cleanup_mnt+0x43/0x90
        [<ffffffff8125e002>] __cleanup_mnt+0x12/0x20
        [<ffffffff810a82e4>] task_work_run+0xc4/0xf0
        [<ffffffff8101f0bd>] do_notify_resume+0x8d/0xa0
        [<ffffffff81864141>] int_signal+0x12/0x17

 -> #0 (&sbi->gc_mutex){+.+.+.}:
        [<ffffffff810e2866>] __lock_acquire+0x1ac6/0x1c90
        [<ffffffff810e2b11>] lock_acquire+0xe1/0x2f0
        [<ffffffff8185e1b3>] mutex_lock_nested+0x63/0x530
        [<ffffffffa02bdd33>] f2fs_balance_fs+0x73/0x90 [f2fs]
        [<ffffffffa02b5938>] f2fs_write_data_page+0x348/0x5b0 [f2fs]
        [<ffffffffa02af9da>] __f2fs_writepage+0x1a/0x50 [f2fs]
        [<ffffffff811c1b54>] write_cache_pages+0x274/0x6f0
        [<ffffffffa02b2630>] f2fs_write_data_pages+0xe0/0x3a0 [f2fs]
        [<ffffffff811c38c1>] do_writepages+0x21/0x50
        [<ffffffff8126c5a6>] __writeback_single_inode+0x76/0xbf0
        [<ffffffff8126d44a>] writeback_sb_inodes+0x32a/0x710
        [<ffffffff8126d8cf>] __writeback_inodes_wb+0x9f/0xd0
        [<ffffffff8126dcdb>] wb_writeback+0x3db/0x850
        [<ffffffff8126e848>] bdi_writeback_workfn+0x148/0x980
        [<ffffffff810a3782>] process_one_work+0x1e2/0x840
        [<ffffffff810a3f01>] worker_thread+0x121/0x460
        [<ffffffff810a9dc8>] kthread+0xf8/0x110
        [<ffffffff81863dbc>] ret_from_fork+0x7c/0xb0

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Martinusbe pushed a commit that referenced this pull request Feb 23, 2016
We will encounter oops by executing below command.
getfattr -n system.advise /mnt/f2fs/file
Killed

message log:
BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs]
*pdpt = 00000000319b7001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
Modules linked in: f2fs(O) snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev
snd_seq_device snd_timer bnep snd rfcomm microcode bluetooth soundcore i2c_piix4 mac_hid serio_raw parport_pc ppdev lp parport
binfmt_misc hid_generic psmouse usbhid hid e1000 [last unloaded: f2fs]
CPU: 3 PID: 3134 Comm: getfattr Tainted: G           O    4.0.0-rc1 #6
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: f3a71b60 ti: f19a6000 task.ti: f19a6000
EIP: 0060:[<f8b54d69>] EFLAGS: 00010246 CPU: 3
EIP is at f2fs_xattr_advise_get+0x29/0x40 [f2fs]
EAX: 00000000 EBX: f19a7e71 ECX: 00000000 EDX: f8b5b467
ESI: 00000000 EDI: f2008570 EBP: f19a7e14 ESP: f19a7e08
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: 00000000 CR3: 319b8000 CR4: 000007f0
Stack:
 f8b5a634 c0cbb580 00000000 f19a7e34 c1193850 00000000 00000007 f19a7e71
 f19a7e64 c0cbb580 c1193810 f19a7e50 c1193c00 00000000 00000000 00000000
 c0cbb580 00000000 f19a7f70 c1194097 00000000 00000000 00000000 74737973
Call Trace:
 [<c1193850>] generic_getxattr+0x40/0x50
 [<c1193810>] ? xattr_resolve_name+0x80/0x80
 [<c1193c00>] vfs_getxattr+0x70/0xa0
 [<c1194097>] getxattr+0x87/0x190
 [<c11801d7>] ? path_lookupat+0x57/0x5f0
 [<c11819d2>] ? putname+0x32/0x50
 [<c116653a>] ? kmem_cache_alloc+0x2a/0x130
 [<c11819d2>] ? putname+0x32/0x50
 [<c11819d2>] ? putname+0x32/0x50
 [<c11819d2>] ? putname+0x32/0x50
 [<c11827f9>] ? user_path_at_empty+0x49/0x70
 [<c118283f>] ? user_path_at+0x1f/0x30
 [<c11941e7>] path_getxattr+0x47/0x80
 [<c11948e7>] SyS_getxattr+0x27/0x30
 [<c163f748>] sysenter_do_call+0x12/0x12
Code: 66 90 55 89 e5 57 56 53 66 66 66 66 90 8b 78 20 89 d3 ba 67 b4 b5 f8 89 d8 89 ce e8 42 7c 7b c8 85 c0 75 16 0f b6 87 44 01 00
00 <88> 06 b8 01 00 00 00 5b 5e 5f 5d c3 8d 76 00 b8 ea ff ff ff eb
EIP: [<f8b54d69>] f2fs_xattr_advise_get+0x29/0x40 [f2fs] SS:ESP 0068:f19a7e08
CR2: 0000000000000000
---[ end trace 860260654f1f416a ]---

The reason is that in getfattr there are two steps which is indicated by strace info:
1) try to lookup and get size of specified xattr.
2) get value of the extented attribute.

strace info:
getxattr("/mnt/f2fs/file", "system.advise", 0x0, 0) = 1
getxattr("/mnt/f2fs/file", "system.advise", "\x00", 256) = 1

For the first step, getfattr may pass a NULL pointer in @value and zero in @SiZe
as parameters for ->getxattr, but we access this @value pointer directly without
checking whether the pointer is valid or not in f2fs_xattr_advise_get, so the
oops occurs.

This patch fixes this issue by verifying @value pointer before using.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Martinusbe pushed a commit that referenced this pull request Feb 23, 2016
If a user key gets negatively instantiated, an error code is cached in the
payload area.  A negatively instantiated key may be then be positively
instantiated by updating it with valid data.  However, the ->update key
type method must be aware that the error code may be there.

The following may be used to trigger the bug in the user key type:

    keyctl request2 user user "" @U
    keyctl add user user "a" @U

which manifests itself as:

	BUG: unable to handle kernel paging request at 00000000ffffff8a
	IP: [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
	PGD 7cc30067 PUD 0
	Oops: 0002 [#1] SMP
	Modules linked in:
	CPU: 3 PID: 2644 Comm: a.out Not tainted 4.3.0+ #49
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
	task: ffff88003ddea700 ti: ffff88003dd88000 task.ti: ffff88003dd88000
	RIP: 0010:[<ffffffff810a376f>]  [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280
	 [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
	RSP: 0018:ffff88003dd8bdb0  EFLAGS: 00010246
	RAX: 00000000ffffff82 RBX: 0000000000000000 RCX: 0000000000000001
	RDX: ffffffff81e3fe40 RSI: 0000000000000000 RDI: 00000000ffffff82
	RBP: ffff88003dd8bde0 R08: ffff88007d2d2da0 R09: 0000000000000000
	R10: 0000000000000000 R11: ffff88003e8073c0 R12: 00000000ffffff82
	R13: ffff88003dd8be68 R14: ffff88007d027600 R15: ffff88003ddea700
	FS:  0000000000b92880(0063) GS:ffff88007fd00000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
	CR2: 00000000ffffff8a CR3: 000000007cc5f000 CR4: 00000000000006e0
	Stack:
	 ffff88003dd8bdf0 ffffffff81160a8a 0000000000000000 00000000ffffff82
	 ffff88003dd8be68 ffff88007d027600 ffff88003dd8bdf0 ffffffff810a39e5
	 ffff88003dd8be20 ffffffff812a31ab ffff88007d027600 ffff88007d027620
	Call Trace:
	 [<ffffffff810a39e5>] kfree_call_rcu+0x15/0x20 kernel/rcu/tree.c:3136
	 [<ffffffff812a31ab>] user_update+0x8b/0xb0 security/keys/user_defined.c:129
	 [<     inline     >] __key_update security/keys/key.c:730
	 [<ffffffff8129e5c1>] key_create_or_update+0x291/0x440 security/keys/key.c:908
	 [<     inline     >] SYSC_add_key security/keys/keyctl.c:125
	 [<ffffffff8129fc21>] SyS_add_key+0x101/0x1e0 security/keys/keyctl.c:60
	 [<ffffffff8185f617>] entry_SYSCALL_64_fastpath+0x12/0x6a arch/x86/entry/entry_64.S:185

Note the error code (-ENOKEY) in EDX.

A similar bug can be tripped by:

    keyctl request2 trusted user "" @U
    keyctl add trusted user "a" @U

This should also affect encrypted keys - but that has to be correctly
parameterised or it will fail with EINVAL before getting to the bit that
will crashes.

Change-Id: I6452091ee01e324c934cf0c7d5a0f3ffa5d0b001
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Martinusbe pushed a commit that referenced this pull request Feb 23, 2016
commit 903124f upstream.

memset() to 0 interfaces array before reusing
usb_configuration structure.

This commit fix bug:

ln -s functions/acm.1 configs/c.1
ln -s functions/acm.2 configs/c.1
ln -s functions/acm.3 configs/c.1
echo "UDC name" > UDC
echo "" > UDC
rm configs/c.1/acm.*
rmdir functions/*
mkdir functions/ecm.usb0
ln -s functions/ecm.usb0 configs/c.1
echo "UDC name" > UDC

[   82.220969] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   82.229009] pgd = c0004000
[   82.231698] [00000000] *pgd=00000000
[   82.235260] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[   82.240638] Modules linked in:
[   82.243681] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0-rc2 #39
[   82.249926] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[   82.256003] task: c07cd2f0 ti: c07c8000 task.ti: c07c8000
[   82.261393] PC is at composite_setup+0xe3c/0x1674
[   82.266073] LR is at composite_setup+0xf20/0x1674
[   82.270760] pc : [<c03510d4>]    lr : [<c03511b8>]    psr: 600001d3
[   82.270760] sp : c07c9df0  ip : c0806448  fp : ed8c9c9c
[   82.282216] r10: 00000001  r9 : 00000000  r8 : edaae918
[   82.287425] r7 : ed551cc0  r6 : 00007fff  r5 : 00000000  r4 : ed799634
[   82.293934] r3 : 00000003  r2 : 00010002  r1 : edaae918  r0 : 0000002e
[   82.300446] Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
[   82.307910] Control: 10c5387d  Table: 6bc1804a  DAC: 00000015
[   82.313638] Process swapper/0 (pid: 0, stack limit = 0xc07c8210)
[   82.319627] Stack: (0xc07c9df0 to 0xc07ca000)
[   82.323969] 9de0:                                     00000000 c06e65f4 00000000 c07c9f68
[   82.332130] 9e00: 00000067 c07c59ac 000003f7 edaae918 ed8c9c98 ed799690 eca2f140 200001d3
[   82.340289] 9e20: ee79a2d8 c07c9e88 c07c5304 ffff55db 00010002 edaae810 edaae860 eda96d50
[   82.348448] 9e40: 00000009 ee264510 00000007 c07ca444 edaae860 c0340890 c0827a40 ffff55e0
[   82.356607] 9e60: c0827a40 eda96e40 ee264510 edaae810 00000000 edaae860 00000007 c07ca444
[   82.364766] 9e80: edaae860 c0354170 c03407dc c033db4c edaae810 00000000 00000000 00000010
[   82.372925] 9ea0: 00000032 c0341670 00000000 00000000 00000001 eda96e00 00000000 00000000
[   82.381084] 9ec0: 00000000 00000032 c0803a23 ee1aa840 00000001 c005d54c 249e2450 00000000
[   82.389244] 9ee0: 200001d3 ee1aa840 ee1aa8a0 ed84f4c0 00000000 c07c9f68 00000067 c07c59ac
[   82.397403] 9f00: 00000000 c005d688 ee1aa840 ee1aa8a0 c07db4b4 c006009c 00000032 00000000
[   82.405562] 9f20: 00000001 c005ce20 c07c59ac c005cf34 f002000c c07ca780 c07c9f68 00000057
[   82.413722] 9f40: f0020000 413fc090 00000001 c00086b4 c000f804 60000053 ffffffff c07c9f9c
[   82.421880] 9f60: c0803a20 c0011fc0 00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.430040] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.438199] 9fa0: c000f800 c000f804 60000053 ffffffff 00000000 c0050e70 c0803bc0 c0783bd8
[   82.446358] 9fc0: ffffffff ffffffff c0783664 00000000 00000000 c07b13e8 00000000 c0803e54
[   82.454517] 9fe0: c07ca480 c07b13e4 c07ce40c 4000406a 00000000 40008074 00000000 00000000
[   82.462689] [<c03510d4>] (composite_setup) from [<c0340890>] (s3c_hsotg_complete_setup+0xb4/0x418)
[   82.471626] [<c0340890>] (s3c_hsotg_complete_setup) from [<c0354170>] (usb_gadget_giveback_request+0xc/0x10)
[   82.481429] [<c0354170>] (usb_gadget_giveback_request) from [<c033db4c>] (s3c_hsotg_complete_request+0xcc/0x12c)
[   82.491583] [<c033db4c>] (s3c_hsotg_complete_request) from [<c0341670>] (s3c_hsotg_irq+0x4fc/0x558)
[   82.500614] [<c0341670>] (s3c_hsotg_irq) from [<c005d54c>] (handle_irq_event_percpu+0x50/0x150)
[   82.509291] [<c005d54c>] (handle_irq_event_percpu) from [<c005d688>] (handle_irq_event+0x3c/0x5c)
[   82.518145] [<c005d688>] (handle_irq_event) from [<c006009c>] (handle_fasteoi_irq+0xd4/0x18c)
[   82.526650] [<c006009c>] (handle_fasteoi_irq) from [<c005ce20>] (generic_handle_irq+0x20/0x30)
[   82.535242] [<c005ce20>] (generic_handle_irq) from [<c005cf34>] (__handle_domain_irq+0x6c/0xdc)
[   82.543923] [<c005cf34>] (__handle_domain_irq) from [<c00086b4>] (gic_handle_irq+0x2c/0x6c)
[   82.552256] [<c00086b4>] (gic_handle_irq) from [<c0011fc0>] (__irq_svc+0x40/0x74)
[   82.559716] Exception stack(0xc07c9f68 to 0xc07c9fb0)
[   82.564753] 9f60:                   00000000 00000000 c07c9fb8 c001bee0 c07ca4f0 c057004c
[   82.572913] 9f80: c07ca4fc c0803a20 c0803a20 413fc090 00000001 00000000 01000000 c07c9fb0
[   82.581069] 9fa0: c000f800 c000f804 60000053 ffffffff
[   82.586113] [<c0011fc0>] (__irq_svc) from [<c000f804>] (arch_cpu_idle+0x30/0x3c)
[   82.593491] [<c000f804>] (arch_cpu_idle) from [<c0050e70>] (cpu_startup_entry+0x128/0x1a4)
[   82.601740] [<c0050e70>] (cpu_startup_entry) from [<c0783bd8>] (start_kernel+0x350/0x3bc)
[   82.609890] Code: 0a000002 e3530005 05975010 15975008 (e5953000)
[   82.615965] ---[ end trace f57d5f599a5f1bfa ]---

Most of kernel code assume that interface array in
struct usb_configuration is NULL terminated.

When gadget is composed with configfs configuration
structure may be reused for different functions set.

This bug happens because purge_configs_funcs() sets
only next_interface_id to 0. Interface array still
contains pointers to already freed interfaces. If in
second try we add less interfaces than earlier we
may access unallocated memory when trying to get
interface descriptors.

Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martinusbe pushed a commit that referenced this pull request Feb 23, 2016
While disabling ConfigFS Android gadget, android_disconnect() calls
kill_all_hid_devices(), if CONFIG_USB_CONFIGFS_F_ACC is enabled, to free
the registered HIDs without checking whether the USB accessory device
really exist or not. If USB accessory device doesn't exist then we run into
following kernel panic:
----8<----
[  136.724761] Unable to handle kernel NULL pointer dereference at virtual address 00000064
[  136.724809] pgd = c0204000
[  136.731924] [00000064] *pgd=00000000
[  136.737830] Internal error: Oops: 5 [#1] SMP ARM
[  136.738108] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc4-00400-gf75300e-dirty #76
[  136.742788] task: c0fb19d8 ti: c0fa4000 task.ti: c0fa4000
[  136.750890] PC is at _raw_spin_lock_irqsave+0x24/0x60
[  136.756246] LR is at kill_all_hid_devices+0x24/0x114
---->8----

This patch adds a test to check if USB Accessory device exists before freeing HIDs.

Change-Id: Ie229feaf0de3f4f7a151fcaa9a994e34e15ff73b
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
(cherry picked from commit 32a71bc)
Git-commit: 41ba062
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Ian Maund <imaund@codeaurora.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.