487 Commits

Author SHA1 Message Date
securecrt
1672c9446f msm: kgsl: Optimize page_alloc allocations
User memory needs to be zeroed out before it is sent to the user.
To do this, the kernel maps the page, memsets it to zero and then
unmaps it.  By virtue of mapping it, this forces us to flush the
dcache to ensure cache coherency between kernel and user mappings.
Originally, the page_alloc loop was using GFP_ZERO (which does a
map, memset, and unmap for each individual page) and then we were
additionally calling flush_dcache_page() for each page killing us
on performance.  It is far more efficient, especially for large
allocations (> 1MB), to allocate the pages without GFP_ZERO and
then to vmap the entire allocation, memset it to zero, flush the
cache and then unmap. This process is slightly slower for very
small allocations, but only by a few microseconds, and is well
within the margin of acceptability. In all, the new scheme is
faster than the default for all sizes greater than 16k, and is
almost 4X faster for 2MB and 4MB allocations which are common for
textures and very large buffer objects.

The downside is that if there isn't enough vmalloc room for the
allocation, we are forced to fall back to a slow page-by-page
memset/flush, but this should happen rarely (if at all) and
is only included for completeness.
2012-07-26 14:45:24 +08:00
securecrt
394bda433a msm: kgsl: Map a guard page on the back of GPU MMU regions
Add a guard page on the backside of page_alloc MMU mappings to protect
against an overzealous GPU pre-fetch engine that sometimes oversteps the
end of the mapped region. The same physical page can be re-used for each
mapping so we only need to allocate one physical page to rule them all
and in the darkness bind them.
2012-07-26 14:04:25 +08:00
securecrt
4822aef009 msm: kgsl: Change name of vmalloc allocator
Change the vmalloc allocation name to something more appropriate since
we do not allocate memory using vmalloc for the userspace driver. We
directly allocate physical pages and map them into the user address space.
The name is changed to page_alloc instead of vmalloc. Add sysfs files to
track memory usage via both vmalloc and page_alloc.
2012-07-26 13:52:28 +08:00
securecrt
e2ff78936f msm: kgsl: Do not dereference pointer before checking against NULL
The pagetable pointer was checked against NULL after being used.
Check against NULL first and then dereference it.
2012-07-25 21:10:10 +08:00
securecrt
41b9064ec2 msm: kgsl: don't clear gpuaddr when unmapping global mappings
Memory mapped through kgsl_mmu_map_global() is supposed to have
the same gpu address in all pagetables. And the memdesc will
persist beyond the lifetime of any single pagetable.
Therefore, memdesc->gpuaddr should not be zeroed for these
memdescs.
2012-07-25 21:08:59 +08:00
securecrt
121a2a91a5 msm: kgsl: Add GMEM size configuration in gpu list
To avoid msm or gpu specific code in the driver, add a
GMEM size configuration parameter to the gpu list.
2012-07-25 20:39:13 +08:00
securecrt
efa80a4cc1 msm: kgsl: Cleanup header file macros
Remove macro logic for macros that are always defined.
2012-07-25 20:27:26 +08:00
securecrt
503977ed6b fix #4151332 2012-07-25 20:23:24 +08:00
securecrt
15793c0aaa msm: kgsl: Find a mem_entry by way of a GPU address and a pagetable base
Given a pagetable base and a GPU address, find the struct kgsl_mem_entry
that matches the object.  Move this functionality out from inside another
function and promote it to top level so it can be used by upcoming
functionality.
2012-07-25 19:54:21 +08:00
securecrt
41513329a1 msm: kgsl: Detach memory objects from a process ahead of destroy time
Previously, memory objects assumed that they remained attached to a
process until they were destroyed. In the past this was mostly true,
but worked by luck because a process could technically map the memory
and then close the file descriptor which would eventually explode. Now we
do the process related cleanup (MMU unmap, fixup statistics) when the
object is released from the process so the process can go away without
affecting the other holders of the mem object refcount.
2012-07-25 19:47:35 +08:00
securecrt
93d86da2ee msm: kgsl: handle all indirect buffer types in postmortem
Postmortem dump was not parsing CP_INDIRECT_BUFFER_PFE commands.
Snapshot was recently fixed to handle this, and this change
extends support to postmortem dump.
2012-07-25 19:41:35 +08:00
securecrt
543247cd01 msm: kgsl: return correct error code for unknown ioctls
Unknown ioctl code errors are supposed to be ENOIOCTLCMD,
not EINVAL.
2012-07-25 19:35:35 +08:00
securecrt
1b6fa28430 msm: kgsl: Update the GMEM and istore size for A320
Set the correct GMEM and istore sizes for A320 on APQ8064.
The more GMEM we have the happier we are, so the code will
work with 256K, but it will be better with 512K.  For the
instruction store the size is important during GPU snapshot
and postmortem dump.  Also, the size of each instruction is
different on A3XX so remove the hard coded constants and
add a GPU specific size variable.
2012-07-25 19:14:12 +08:00
securecrt
411b4bcb90 reduced the PMEM_ADSP size as the HW decoder still can't work on HD2 2012-07-25 19:12:41 +08:00
SecureCRT
0885149512 msm: kgsl: Add support for the A3XX family of GPUs
Add support for the A320, the first of the new generation
of Adreno GPUs.
2012-07-25 00:10:26 +08:00
SecureCRT
be4c38e2f5 msm: kgsl: handle larger instruction store for adreno225
This GPU has a larger instruction store, so more memory
needs to be reserved for saving shader state when context
switching.

The initial vertex and pixel partitioning of the
instruction store also needs to be different.
2012-07-24 23:30:19 +08:00
securecrt
ee339b2bcb msm: kgsl: Write the retired timestamp on resume
Write the retired timestamp into the expected location. This fixes
userspace crashes after resume when the retired timestamp is read
as 0 instead of the expected last timestamp.
2012-07-23 18:59:50 +08:00
securecrt
148ebef127 reverse DEBUG_TRACE_VDEC 2012-07-23 14:37:40 +08:00
securecrt
544a54b32b ignore the version check 2012-07-23 14:13:02 +08:00
SecureCRT
b8450f4096 msm: kgsl: change timestamp frees to use kgsl_event
The timestamp memqueue was unsorted, which could cause
memory to not be freed soon enough. The kgsl_event
list is sorted and does almost exactly the same thing
as the memqueue did, so freememontimestamp is now
implemented using the kgsl_event list.
2012-06-23 19:03:55 +08:00
SecureCRT
4520a7c383 msm: kgsl: cancel events from kgsl_release
Events need to be cancelled when an fd is released,
to avoid possible memory leaks or use after free.

When the event is cancelled, its callback is called.
Currently this is sufficient since events are used for
resource management and we have no option but to
release the lock or memory. If future uses need to
distinguish between the callback firing and
a cancel, they can look at the timestamp passed to
the callback, which will be before the timestamp they
expected. Otherwise a separate cancel callback can
be added.
2012-06-23 18:52:06 +08:00
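The cancel behaviour described in 4520a7c383 can be illustrated with a small userspace sketch. This is not the driver's code: `struct event`, `event_add`, `event_cancel_owner` and `record_ts` are hypothetical names, the `owner` pointer stands in for the per-fd private data, and timestamp rollover is ignored for brevity.

```c
#include <stdint.h>
#include <stdlib.h>

typedef void (*event_fn)(void *priv, uint32_t timestamp);

/* Hypothetical event record; the real kgsl_event layout is not shown here. */
struct event {
    uint32_t timestamp;  /* timestamp the caller is waiting for */
    event_fn func;       /* callback run on expiry or on cancel */
    void *priv;          /* argument handed back to the callback */
    void *owner;         /* stand-in for the fd that queued the event */
    struct event *next;
};

/* Insert an event, keeping the list sorted by ascending timestamp. */
static int event_add(struct event **head, uint32_t ts, event_fn func,
                     void *priv, void *owner)
{
    struct event *ev = malloc(sizeof(*ev));
    if (ev == NULL)
        return -1;
    ev->timestamp = ts;
    ev->func = func;
    ev->priv = priv;
    ev->owner = owner;

    struct event **pos = head;
    while (*pos != NULL && (*pos)->timestamp <= ts)
        pos = &(*pos)->next;
    ev->next = *pos;
    *pos = ev;
    return 0;
}

/*
 * Cancel every event queued by `owner`. Each callback still runs, but it
 * receives the current timestamp, which is earlier than the one it asked for.
 */
static void event_cancel_owner(struct event **head, void *owner,
                               uint32_t cur_ts)
{
    struct event **pos = head;
    while (*pos != NULL) {
        struct event *ev = *pos;
        if (ev->owner == owner) {
            *pos = ev->next;          /* unlink before calling back */
            ev->func(ev->priv, cur_ts);
            free(ev);
        } else {
            pos = &ev->next;
        }
    }
}

/* Example callback: records the timestamp it was called with. */
static void record_ts(void *priv, uint32_t timestamp)
{
    *(uint32_t *)priv = timestamp;
}
```

A callback that needs to tell a cancel from a normal expiry could compare the timestamp it receives against the one it registered for, as the commit message suggests.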
SecureCRT
f6acf3ab9f msm: kgsl: queue timestamp expired work more often
There are some workloads where interrupts do not
always get generated, and as a result the timestamp
work was not triggered often enough.

Queue timestamp expired work from adreno_waittimestamp(),
when the timestamp expires while we are not waiting.
It is possible in this case that no interrupt fired
because no processes were waiting.

Queue timestamp expired work when freememontimestamp
is called, which reduces the amount of memory
built up by applications that use this api often.
2012-06-23 17:48:20 +08:00
SecureCRT
5c1047c767 msm: kgsl: set the dma_address field of scatterlists
Ion carveout and content protect heap buffers do not
have a struct page associated with them. Thus
sg_phys() will not work reliably on these buffers.
Set the dma_address field on physically contiguous
buffers.  When mapping a scatterlist to the gpummu
use sg_dma_address() first and if it returns 0
then use sg_phys().

msm: kgsl: Use kzalloc to allocate scatterlists of 1 page or less

The majority of the scatterlist allocations used in KGSL are under 1
page (1 page of struct scatterlist is approximately 1024 entries
equalling 4MB of allocated buffer).  In these cases using vmalloc
for the sglist is undesirable and slow.  Add functions to check the
size of the allocation and favor kzalloc for 1 page allocations and
vmalloc for larger lists.
2012-06-23 17:02:28 +08:00
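The size check described in the second half of 5c1047c767 amounts to asking whether the whole scatterlist array fits in one page. A minimal userspace sketch of that decision follows; `sg_pick_allocator` and the enum are hypothetical names, and the entry and page sizes are passed in as parameters rather than taken from kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Which allocator the sketch would choose; names are illustrative only. */
enum sg_alloc_kind { SG_ALLOC_KZALLOC, SG_ALLOC_VMALLOC };

/*
 * Favor the slab allocator when the array of `nents` entries fits in a
 * single page, and fall back to vmalloc for anything larger.
 */
static enum sg_alloc_kind sg_pick_allocator(size_t nents, size_t entry_size,
                                            size_t page_size)
{
    assert(entry_size > 0 && page_size > 0);

    /* Dividing instead of multiplying avoids overflow in nents * entry_size. */
    if (nents <= page_size / entry_size)
        return SG_ALLOC_KZALLOC;
    return SG_ALLOC_VMALLOC;
}
```

Whichever allocator is chosen, the matching free path has to make the same decision, so a real implementation would likely pair this with a free helper that takes the same size.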
SecureCRT
a7bb935abb revert the pmem size to default configuration 2012-06-23 17:01:57 +08:00
SecureCRT
361e591fe7 msm: kgsl: remove readl/writel use for dma memory
For dma_alloc_coherent() you don't need writel/readl because
it's just a plain old void *. Linux tries very hard to make a
distinction between io memory (void __iomem *) and memory
(void *) so that drivers are portable to architectures that
don't have a way to access registers via pointer dereferences.
You can see http://lwn.net/Articles/102232/ and the Linus rant
http://lwn.net/Articles/102240/ here for more details behind
the motivation.

msm: kgsl: Allocate physical pages instead of using vmalloc

Replace vmalloc allocation with physical page allocation. For most
allocations we do not need a kernel virtual address. vmalloc uses up
the kernel virtual address space. Replacing vmalloc with physical
page allocation, and mapping that allocation to kernel space only
when it is required, prevents the kgsl driver from using unnecessary
vmalloc virtual space.
2012-06-22 16:49:00 +08:00
SecureCRT
8c39724a75 remove zImage before compile 2012-06-22 16:48:37 +08:00
SecureCRT
47e6ec131b reverse the GENLOCK 2012-06-22 16:20:22 +08:00
SecureCRT
376f66c119 msm: kgsl: convert sg allocation to vmalloc
kmalloc allocates physically contiguous memory and
may fail for larger allocations due to fragmentation.
The large allocations are caused by the fact that the
scatterlist structure is 24 bytes and the array size
is proportional to the number of pages being mapped.
2012-06-22 16:08:12 +08:00
SecureCRT
b4c5202bec msm: kgsl: make cffdump work with the MMU enabled
The tools that process cff dumps expect a linear
memory region, but the start address of that region can
be configured. As long as there is only a single
pagetable (so that there aren't duplicate virtual
addresses in the dump), dumps captured with the
mmu on are easier to deal with than reconfiguring
to turn the mmu off.
2012-06-22 15:38:14 +08:00
SecureCRT
a19d2698cc msm: kgsl: Add ION as an external memory source
Allow ION buffers to be attached via IOCTL_KGSL_MAP_USER_MEM
2012-06-22 15:24:51 +08:00
securecrt
91bbe54c4f msm: kgsl: Fixup per-process memory statistics
Make the framework for reporting per-process memory statistics a little bit
more generic.  This should make it easier to keep track of more external
memory sources as they are added.
2012-06-21 13:41:21 +08:00
securecrt
9d909cf27b msm: kgsl: Make sure kmemleak tool does not report incorrect mem leak.
Certain memory allocations are not properly tracked by the kmemleak tool,
which causes it to incorrectly detect memory leaks. Notify the tool by
using kmemleak_not_leak() to ignore these allocations so that incorrect
leak reports are avoided.
2012-06-21 13:01:23 +08:00
securecrt
dcf924f072 msm: kgsl: Add a new property to IOCTL_KGSL_DEVICE_GETPROPERTY
Return the reset status of the GPU unit when
IOCTL_KGSL_DEVICE_GETPROPERTY is called with
type KGSL_PROP_GPU_RESET_STAT
2012-06-21 12:54:12 +08:00
securecrt
69555a62d1 msm: kgsl: Poke regularly in adreno_idle
Poking once during adreno_idle is not enough; a GPU hang may still happen.
Seen on 7x27A. Write a few times during the wait timeout, to ensure that
the WPTR is updated properly.
2012-06-21 12:46:57 +08:00
securecrt
aa5de9cfcb msm: kgsl: increase valid timestamp range
The existing timestamp_cmp function returns a different
result depending on the order of the input parameters due to
having an asymmetric valid window. When no rollover is
detected the window is 2^31 but when a rollover is detected
the window is 25000. This change makes the rollover window
symmetric at 2^31.
2012-06-21 12:34:57 +08:00
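One way to get a symmetric 2^31 window, as aa5de9cfcb describes, is to compare the wrapped unsigned difference against the window. The sketch below assumes 32-bit timestamps; `ts_cmp` and `TS_WINDOW` are hypothetical names and this is not necessarily how the driver's timestamp_cmp is written.

```c
#include <stdint.h>

#define TS_WINDOW 0x80000000u  /* 2^31, half of the 32-bit timestamp space */

/*
 * Return 0 if a == b, 1 if a is newer than b, -1 if a is older than b.
 * The unsigned subtraction wraps modulo 2^32, so a rollover between b
 * and a is handled by the same test as the non-rollover case.
 */
static int ts_cmp(uint32_t a, uint32_t b)
{
    if (a == b)
        return 0;

    uint32_t diff = a - b;  /* distance from b forward to a, modulo 2^32 */

    /* Exactly 2^31 apart is ambiguous; this sketch reports -1 either way. */
    return (diff < TS_WINDOW) ? 1 : -1;
}
```

Because the same window applies in both directions, swapping the arguments flips the sign of the result for every pair except the ambiguous case noted in the comment.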
securecrt
d319fcfbbd msm: kgsl: flush outer cache for alloc_page() pages
The outer cache needs to be flushed for these pages
after they are allocated so that the GPU and CPU
have a consistent view of them.
2012-06-21 12:30:20 +08:00
SecureCRT
97dd7fe6b5 msm: kgsl: Add a constant for adreno_ringbuffer_issuecmds flags
Use a #define constant instead of a bare constant for the flags
parameter of adreno_ringbuffer_issuecmds.
2012-06-21 00:32:58 +08:00
SecureCRT
ae32a212a5 msm: kgsl: fix error handling in adreno_waittimestamp()
This function was incorrectly reporting hangs when an
error such as ERESTARTSYS was returned by
__wait_event_interruptible_timeout().

msm: kgsl: Make sure WPTR reg is updated properly

Sometimes writes to WPTR register do not take effect, causing a
3D core hang. Make sure the WPTR is updated properly when waiting.

msm: kgsl: Set default value of wait_timeout in the adreno_dev struct

Set the initialization value of wait_timeout at compile time in the
declaration of the adreno_device struct instead of at runtime in
adreno_probe.
2012-06-21 00:02:15 +08:00
securecrt
73aff24078 msm: kgsl: fix size checking in adreno_find_region
This function is supposed to return the memdesc that
contains the range gpuaddr to gpuaddr + size. One of the
lookups was using sizeof(unsigned int) instead of size,
which could cause false positive results from this function
and possibly kernel panics in the snapshot or postmortem
code, which rely on it to do bounds checking for them.
2012-06-20 12:39:35 +08:00
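The containment test that 73aff24078 refers to can be sketched as follows. `region_contains` is a hypothetical helper, not the body of adreno_find_region, and it assumes 32-bit GPU addresses; the point is that the caller's `size` has to take part in the check, not a fixed `sizeof(unsigned int)`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when [gpuaddr, gpuaddr + size) lies entirely inside the
 * region [base, base + region_size). Written with subtractions so that
 * an address near the top of the 32-bit space cannot wrap around.
 */
static bool region_contains(uint32_t base, uint32_t region_size,
                            uint32_t gpuaddr, uint32_t size)
{
    if (gpuaddr < base)
        return false;

    uint32_t offset = gpuaddr - base;  /* start of the range within the region */
    if (offset > region_size)
        return false;

    /* The requested size must fit in what is left of the region. */
    return size <= region_size - offset;
}
```

A check that only tested four bytes at `gpuaddr` would accept a range that starts inside the region and runs past its end, which is the kind of false positive the commit message describes.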
securecrt
fd5e7d8237 msm: kgsl: let postmortem dump find context switch IBs
Because the IBs used for context switching are not allocated
by userspace, a separate search is needed to find them
in adreno_find_region.
2012-06-20 12:25:12 +08:00
SecureCRT
c5ac3240a5 msm: kgsl: improve postmortem and cff bounds checking
Some hangs are fooling the postmortem dump code into
running off the end of a buffer. Fix this by making
its bounds check logic work better by reusing the
logic from kgsl_find_region().
2012-06-19 23:30:34 +08:00
SecureCRT
8be096244d msm: kgsl: Fix when GMEM is saved for A2xx
Saving GMEM is set when doing context switching and should not
be set when creating the gmem shadow.
2012-06-19 21:46:18 +08:00
securecrt
2f3f4d14f9 msm: kgsl: Add support for the preamble context flag
Userspace will set a flag in the context if preambles are in use. If
they are, we can safely skip save and restore commands for the
context. GMEM save/restore is still required.  To improve performance,
preamble commands are skipped when the context hasn't changed since
the last issueibcmds.

from Code Aurora
2012-06-19 14:00:07 +08:00
SecureCRT
cad19fbe99 change the build batch file 2012-06-19 01:38:16 +08:00
SecureCRT
83cf3269bc add more sf_pmem to prevent memory full 2012-06-19 01:37:29 +08:00
SecureCRT
758812c3aa fixed the adsp pmem size being too low for camera 2012-06-18 23:52:45 +08:00
securecrt
1bd0e44d7a reduced the pmem size to save memory for userspace, TEST ONLY!! 2012-06-18 20:31:47 +08:00
securecrt
4f50d63951 msm: kgsl: fix format of the rbbm read error message
msm: kgsl: Assign a valid context only after one has been restored
2012-06-18 20:28:17 +08:00
SecureCRT
d0bde07fa4 set ALLORNOTHING allocator for mdp heap 2012-06-05 00:12:26 +08:00
SecureCRT
32f796ad5c compress boot and system dir only 2012-06-02 16:34:51 +08:00