Given a pagetable base and a GPU address, find the struct kgsl_mem_entry
that matches the object. Move this functionality out from inside another
function and promote it to top level so it can be used by upcoming
functionality.
Set the correct GMEM and istore sizes for A320 on APQ8064.
The more GMEM we have the happier we are, so the code will
work with 256K, but it will be better with 512K. For the
instruction store the size is important during GPU snapshot
and postmortem dump. Also, the size of each instruction is
different on A3XX so remove the hard coded constants and
add a GPU specific size variable.
This GPU has a larger instruction store, so more memory
needs to be reserved for saving shader state when context
switching.
The initial vertex and pixel partitioning of the
instruction store also needs to be different.
The timestamp memqueue was unsorted, which could cause
memory to not be freed soon enough. The kgsl_event
list is sorted and does almost exactly the same thing
as the memqueue did, so freememontimestamp is now
implemented using the kgsl_event list.
There are a some workloads where interrupts do not
always get generated, and as a result the timestamp
work was not triggered often enough.
Queue timestamp expired work from adreno_waittimestamp(),
when the timestamp expires while we are not waiting.
It is possible in this case that no interrupt fired
because no processes were waiting.
Queue timestamp expired work when freememontimestamp
is called, which reduces the amount of memory
built up by applications that use this api often.
For dma_alloc_coherent() you don't need writel/readl because
it's just a plain old void *. Linux tries very hard to make a
distinction between io memory (void __iomem *) and memory
(void *) so that drivers are portable to architectures that
don't have a way to access registers via pointer dereferences.
You can see http://lwn.net/Articles/102232/ and the Linus rant
http://lwn.net/Articles/102240/ here for more details behind
the motivation.
msm: kgsl: Allocate physical pages instead of using vmalloc
Replace vmalloc allocation with physical page allocation. For most
allocations we do not need a kernel virual address. vmalloc uses up
the kernel virtual address space. By replacing vmalloc with physical
page alloction and mapping that allocation to kernel space only
when it is required prevents the kgsl driver from using unnecessary
vmalloc virtual space.
kmalloc allocates physically contiguous memory and
may fail for larger allocations due to fragmentation.
The large allocations are caused by the fact that the
scatterlist structure is 24 bytes and the array size
is proportional to the number of pages being mapped.
The tools that process cff dumps expect a linear
memory region, but the start address of that region can
be configured. As long as there is only a single
pagetable (so that there aren't duplicate virtual
addresses in the dump), dumps captured with the
mmu on are easier to deal with than reconfiguring
to turn the mmu off.
Poking once during adreno_idle is not enough; a GPU hang may still happen.
Seen on 7x27A. Write a few times during the wait timeout, to ensure that
the WPTR is updated properly.
This function was incorrectly reporting hangs when an
error such as ERESTARTSYS was returned by
__wait_event_interruptible_timeout().
msm: kgsl: Make sure WPTR reg is updated properly
Sometimes writes to WPTR register do not take effect, causing a
3D core hang. Make sure the WPTR is updated properly when waiting.
msm: kgsl: Set default value of wait_timeout in the adreno_dev struct
Set the initalization value of wait_timeout at compile time in the
declaration of the adreno_device struct instead of at runtime in
adreno_probe.
This function is supposed to return the memdesc that
contains the range gpuaddr to gpuaddr + size. One of the
lookups was using sizeof(unsigned int) instead of size,
which could cause false positive results from this function
and possibly kernel panics in the snapshot or postmortem
code, which rely on it to do bounds checking for them.
Some hangs are fooling the postmortem dump code into
running off the end of a buffer. Fix this by making
its bounds check logic work better by reusing the
logic from kgsl_find_region().