Pull drm updates from Dave Airlie:
"One of the smaller drm -next pulls in ages!
Ben (nouveau) has a rewrite in progress but we decided to leave it
stew for another cycle, so just some fixes from him.
- radeon: lots of documentation work, fixes, more ring and locking
changes, pcie gen2, more dp fixes.
- i915: haswell features, gpu reset fixes, /dev/agpgart removal on
machines that we never used it on, more VGA/HDP fix., more DP fixes
- drm core: cleanups from Daniel, sis 64-bit fixes, range allocator
colouring.
but yeah fairly quiet merge this time, probably because I missed half
of it!"
Trivial add-add conflict in include/linux/pci_regs.h
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (255 commits)
drm/nouveau: init vblank requests list
drm/nv50: extend vblank semaphore to generic dmaobj + offset pair
drm/nouveau: mark most of our ioctls as deprecated, move to compat layer
drm/nouveau: move current gpuobj code out of nouveau_object.c
drm/nouveau/gem: fix object reference leak in a failure path
drm/nv50: rename INVALID_QUERY_OR_TEXTURE error to INVALID_OPERATION
drm/nv84: decode PCRYPT errors
drm/nouveau: dcb table quirk for fdo#50830
nouveau: Fix alignment requirements on src and dst addresses
drm/i915: unbreak lastclose for failed driver init
drm/i915: Set the context before setting up regs for the context.
drm/i915: constify mode in crtc_mode_fixup
drm/i915/lvds: ditch ->prepare special case
drm/i915: dereferencing an error pointer
drm/i915: fix invalid reference handling of the default ctx obj
drm/i915: Add -EIO to the list of known errors for __wait_seqno
drm/i915: Flush the context object from the CPU caches upon switching
drm/radeon: fix dpms on/off on trinity/aruba v2
drm/radeon: on hotplug force link training to happen (v2)
drm/radeon: fix hotplug of DP to DVI|HDMI passive adapters (v2)
...
We should not hit this under any sane conditions, but still, this does not
looks right.
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: Daniel Vetter <daniel.vetter@ffwll.ch>
CC: stable@vger.kernel.org
Reported-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Reviewed-by: Chris Wlison <chris@chris-wilson.co.uk>
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We believe to have squashed all issues around the gen6+ rps interrupt
generation and why the gpu sometimes got stuck. With that cleared up,
there's no user left for the sanitize_pm infrastructure, so let's just
rip it out.
Note that 'intel_reg_write 0xa014 0x13070000' is the w/a if we find
ourselves stuck again.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The power docs say that when the gt leaves rc6, it is in the lowest
frequency and only about 25 usec later will switch to the frequency
selected in GEN6_RPNSWREQ. If the downclock limit expires in that
window and the down limit is set to the lowest possible frequency, the
hw will not send the down interrupt. Which leads to a too high gpu
clock and wasted power.
Chris Wilson already worked on this with
commit 7b9e0ae6da0a7eaf2680a1a788f08df123724f3b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sat Apr 28 08:56:39 2012 +0100
drm/i915: Always update RPS interrupts thresholds along with
frequency
but got the logic inverted: The current code set the down limit as
long as we haven't reached it. Instead of only once with reached the
lowest frequency.
Note that we can't always set the downclock limit to 0, because
otherwise the hw will keep on bugging us with downclock request irqs
once the lowest level is reached.
For similar reasons also always set the upclock limit, otherwise the
hw might poke us again with interrupts.
v2: Chris Wilson noticed that the limit reg is also computed in
sanitize_pm. To avoid duplication, extract the code into a common
function.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By selecting the cache level (essentially whether or not the CPU snoops
any updates to the bo, and on more recent machines whether it resides
inside the CPU's last-level-cache) a userspace driver is able to then
manage all of its memory within buffer objects, if it so desires. This
enables the userspace driver to accelerate uploads and more importantly
downloads from the GPU and to able to mix CPU and GPU rendering/activity
efficiently.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added code comment about where we plan to stuff platform
specific cacheing control bits in the ioctl struct.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Several functions of the GPU have the restriction that differing memory
domains cannot be placed next to each other (as the GPU may prefetch
beyond the end of one domain and hang as it crosses into the other
domain). We use the facility of the drm_mm to mark ranges with a
particular color that corresponds to the cache attributes of those pages
in order to prevent allocating adjacent blocks of differing memory
types.
v2: Rebase ontop of drm_mm coloring v2.
v3: Fix rebinding existing gtt_space and add a verification routine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drv_priv->gmbus is an array. Comparing it with NULL is somewhat less useful
than a chocolate teapot.
Possibly we should be testing bus != NULL each iteration of the loop
instead ?
gcc could help by warning too!
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise our initial behaviour is "randomly save a bogus PLL
choice" as far as I can see.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel writes: (this pull is the one with the bad patch dropped)
First pile of fixes for 3.6 already, and I'm afraid it's a bit larger than
what I'd wish for. But I've moved all the feature-y stuff to -next, so
this really is all -fixes. Most of it is handling fallout from the hw
context stuff, discovered now that mesa git has started using them for
real. Otherwise all just small fixes:
- unbreak modeset=0 on gen6+ (regressed in next)
- const mismatch fix for ->mode_fixup
- simplify overly clever lvds modeset code (current code can totally
confuse backlights, resulting in broken panels until a full power draw
restores them).
- fix some fallout from the flushing_list disabling (regression only
introduced in -next)
- DP link train improvements (this also kills the last 3.2 dp regression
afaik)
- bugfix for the new ddc VGA detection on newer platforms
- minor backlight fixes (one of them a -next regression)
- only enable the required PM interrupts (to avoid waking up the cpu
unnecessarily)
- some really minor bits (workaround clarification, make coverty happy,
hsw init fix)
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: (23 commits)
drm/i915: unbreak lastclose for failed driver init
drm/i915: Set the context before setting up regs for the context.
drm/i915: constify mode in crtc_mode_fixup
drm/i915/lvds: ditch ->prepare special case
drm/i915: dereferencing an error pointer
drm/i915: fix invalid reference handling of the default ctx obj
drm/i915: Add -EIO to the list of known errors for __wait_seqno
drm/i915: Flush the context object from the CPU caches upon switching
drm/i915: Make the lock for pageflips interruptible
drm/i915: don't forget the PCH backlight registers
drm/i915: Insert a flush between batches if the breadcrumb was dropped
drm/i915: missing error case in init status page
drm/i915: mask tiled bit when updating ILK sprites
drm/i915: try to train DP even harder
drm/i915: kill intel_ddc_probe
drm/i915: check whether we actually received an edid in detect_ddc
drm/i915: fix up PCH backlight #define mixup
drm/i915: Add comments to explain the BSD tail write workaround
drm/i915: Disable the BLT on pre-production SNB hardware
drm/i915: initialize power wells in modeset_init_hw
...
* 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/nouveau: init vblank requests list
drm/nv50: extend vblank semaphore to generic dmaobj + offset pair
drm/nouveau: mark most of our ioctls as deprecated, move to compat layer
drm/nouveau: move current gpuobj code out of nouveau_object.c
drm/nouveau/gem: fix object reference leak in a failure path
drm/nv50: rename INVALID_QUERY_OR_TEXTURE error to INVALID_OPERATION
drm/nv84: decode PCRYPT errors
drm/nouveau: dcb table quirk for fdo#50830
nouveau: Fix alignment requirements on src and dst addresses
Fixes kernel panic when vblank interrupt triggers before first sync to
vblank request.
(Besides init, remove some relevant leftovers from vblank rework)
Reported-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: stable@vger.kernel.org [3.5]
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
These will be replaced in the near future, the code isn't yet stable enough
for this merge window however.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Current name is misleading, because this error can be triggered by other
conditions, like changing STRMOUT parameter without disabling STRMOUT first.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Linear copy works by adding the offset to the buffer address,
which may end up not being 16-byte aligned.
Some tests I've written for prime_pcopy show that the engine
allows this correctly, so the restriction on lowest 4 bits of
address can be lifted safely.
The comments added were by envyas, I think because I used
a newer version.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: stable@vger.kernel.org
Originally I had a macro specifically for DPF support, and Daniel, with
good reason asked me to change it to this. It's not the way I would have
gone (and indeed I didn't), but for now there is no distinction as all
platforms with L3 also have DPF.
Note: The good reasons are that dpf is a l3$ feature (at least on
currrent hw), hence I don't expect one to go without the other.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: added note]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Basic context support on HSW is no different than previous generations.
The size of the context object changes, but that's about it.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As suggested by Daniel, rip out the independent timers for device and
crtc busyness and integrate the manual powermanagement of the display
engine into the GEM core and its request tracking. The benefits are that
the code is a lot smaller, fewer moving parts and should fit more neatly
into the overall activity tracking of the driver.
v2: Complete overhaul and removal of the racy timers and workers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By moving the function to intel_ringbuffer and currying the appropriate
parameter, hopefully we make the callsites easier to read and
understand.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise once we use the buffer with a BLT command on gen2/3, we will
always regard future command submissions as continuing the fenced
access. However, now that we flush/invalidate between every batch we can
drop this pessimism.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we unconditionally flush and invalidate between every batch
buffer, we no longer need the complex logic to decide which domains
require flushing. Remove it and rejoice.
v2 (danvet): Keep around the flip waiting logic. It's gross and
broken, I know, but we can't just kill that thing ... even if we just
keep it around as a reminder that things are broken.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rely instead on the insertion of the implicit flush before the seqno
breadcrumb.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As the flush is either performed explictly immediately after the
execbuffer dispatch, or before the serialisation of last_fenced_seqno we
can forgo the explict i915_gem_flush_ring().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is now handled by a global flag to ensure we emit a flush before
the next serialisation point (if we failed to queue one previously).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we guarantee to emit a flush before emitting the breadcrumb or
the next batchbuffer, there is no further need for the flushing list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we always flush the GPU cache prior to emitting the breadcrumb, we no
longer have to worry about the deferred flush causing the
pending_gpu_write to be delayed. So we can instead utilize the known
last_write_seqno to hopefully minimise the wait times.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we move to lazily clearing the GPU write domain only when the buffer
becomes inactive, this leaves a window of opportunity for
i915_gem_object_pin_to_display_plane() to detect a seemingly
inconsistent value. This function is special as it tries to pipeline the
operation to avoid the stall and so may not retires the buffer and we
may not get the opportunity to clear the write domain. However, we know
all is good, so drop the assertion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Request preallocation was added to i915_add_request() in order to
support the overlay. However, not all users care and can quite happily
ignore the failure to allocate the request as they will simply repeat
the request in the future.
By pushing the allocation down into i915_add_request(), we can then
remove some rather ugly error handling in the callers.
v2: Nullify request->file_priv otherwise we chase a garbage pointer
when retiring requests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the base addresses shifting around, this is easier to handle.
Also move to the real reg offset on vlv.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The intention is to help select which engine to use for copies with
interoperating clients - such as a GL client making a request to the X
server to perform a SwapBuffers, which may require copying from the
active GL back buffer to the X front buffer.
We choose to report a mask of the active rings to future proof the
interface against any changes which may allow for the object to reside
upon multiple rings.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: bikeshed away the write ring mask and add the explanation
Chris sent in a follow-up mail why we decided to use masks.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The interface's immediate purpose is to do synchronous timestamp queries
as required by GL_TIMESTAMP. The GPU has a register for reading the
timestamp but because that would normally require root access through
libpciaccess, the IOCTL can provide this service instead.
Currently the implementation whitelists only the render ring timestamp
register, because that is the only thing we need to expose at this time.
v2: make size implicit based on the register offset
Add a generation check
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: fixup the ioctl numerb:]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds support for the ns2501 DVO, found in some older Fujitsu/Siemens Labtops.
It is in the state of "works for me".
Includes now proper DPMS support. Includes switching between resolutions -
from 640x480 to 1024x768.
Currently assumes that the native display resolution is 1024x768.
The ns2501 seems to be rather critical - if the output PLL is not
running, the chip doesn't seem to be clocked and then doesn't react
on i2c messages. Thus, a quick'n-dirty trick ensures that the DVO
is active before submitting any i2c messages to it. This is
probably to be reviewed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=17902
Signed-off-by: Thomas Richter <thor@math.tu-berlin.de>
[danvet: fixup whitespace fail.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will be needed for Haswell, but already has its uses here.
This patch started as a small patch written patch by Shobhit Kumar,
but it has changed so much that none of its original lines remain.
Credits-to: Shobhit Kumar <shobhit.kumar@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have some common code that we always run before calling
intel_dp_set_link_train. This common code sets the correct training
patterns to the DP variable. If we add more calls to
intel_dp_set_link_train, we'll also have to duplicate this common
code. So instead of repeating this code whenever we call
intel_dp_set_link_train, we move the code to inside the function: now
we check which training pattern we're going to set and then we set the
DP register according to it.
One of the side-effects of this change is that now we never forget to
mask the training pattern bits before changing them. It looks like
this was working before because we were first masking the bits, then
writing 00, 01 and then 11.
This patch also enables us to use the intel_dp_set_link_train function
when disabling link training: in this case we need to avoid writing
the DP_TRAINING_LANE*_SET AUX commands.
As a bonus, the big intel_dp_{start,complete}_link_train functions
will get smaller and a little bit easier to read.
Version 2 changes:
- Rewrite commit message.
- Also clear the training pattern bits before changing them.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Instead of having a giant if cascade to figure this out according to
the passed-in register. We could do quite a bit more cleaning up and
all by using the port at more places, but I think this should be part
of a bigger rework to introduce a struct intel_digital_port which
would keep track of all these things. I guess this will be part of
some haswell-DP-induced refactoring.
For now this rips out the big cascade, which is what annoyed me so
much.
v2: Add port variable name back for the func decl (I've tried to trick
myself below the 80 char limit).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Intel hw only has one MUX for encoders, so outputs are either not
cloneable or all in the same group of cloneable outputs. This neatly
simplifies the code and allows us to ditch some ugly if cascades in
the dp and hdmi init code (well, we need these if cascades for other
stuff still, but that can be taken care of in follow-up patches).
Note that this changes two things:
- dvo can now be cloned with sdvo, but dvo is gen2 whereas sdvo is
gen3+, so no problem. Note that the old code had a bug and didn't
allow cloning crt with dvo (but only the other way round).
- sdvo-lvds can now be cloned with sdvo-non-tv. Spec says this won't
work, but the only reason I've found is that you can't use the
panel-fitter (used for lvds upscaling) with anything else. But we
don't use the panel fitter for sdvo-lvds. Imo this part of Bspec is
a) rather confusing b) mostly as a guideline to implementors (i.e.
explicitly stating what is already implicit from the spec, without
always going into the details of why). So I think we can ignore this
- worst case we'll get a bug report from a user with with sdvo-lvds
and sdvo-tmds and have to add that special case back in.
Because sdvo lvds is a bit special explain in comments why sdvo LVDS
outputs can be cloned, but native LVDS and eDP can't be cloned - we
use the panel fitter for the later, but not for sdvo.
Note that this also uncoditionally initializes the panel_vdd work used
by eDP. Trying to be clever doesn't buy us anything (but strange bugs)
and this way we can kill the is_edp check.
v2: Incorporate review from Paulo
- Add in a missing space.
- Pimp comment message to address his concerns.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Splitting them up between pch and gmch variants just makes it harder
to find things. Especially since the hotplug bits are actually valid
on earlier chips, too.
v2: Fixed the comment as pointed out by Paulo Zanoni.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When bug hunting, I found the interface to do_switch() overly
complicated and I believe festered the earlier bug. This aims to make
the code a little clearer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move the DP structure to shared location so that it can be used from
within the ddi module.
Changes from Paulo:
- Move less code to intel_drv.h
- Remove #include statement
- Replace a tab with a space in train_set
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We now refuse to load on gen6+ if kms is not enabled:
commit 26394d9251879231b85e6c8cf899fa43e75c68f1
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Mar 26 21:33:18 2012 +0200
drm/i915: refuse to load on gen6+ without kms
Which results in the drm core calling our lastclose function to clean
up the mess, but that one is neatly broken for such failure cases
since kms has been introduced in
commit 79e539453b34e35f39299a899d263b0a1f1670bd
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Fri Nov 7 14:24:08 2008 -0800
DRM: i915: add mode setting support
Reported-and-tested-by: Paulo Zanoni <przanoni@gmail.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes failures in transform feedback on gen7 because our SOL_RESET
flag was setting the transform feedback offsets in the old context
(occasionally happened to be ours) instead of the new context.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
LVDS is the first output where dpms on/off and prepare/commit don't
perfectly match. Now the idea behind this special case seems to be
that for simple resolution changes on the LVDS we don't need to stop
the pipe, because (at least on newer chips) we can adjust the panel
fitter on the fly.
There are a few problems with the current code though:
- We still stop and restart the pipe unconditionally, because the crtc
helper code isn't flexible enough.
- We show some ugly flickering, especially when changing crtcs (this
the crtc helper would actually take into account, but we don't
implement the encoder->get_crtc callback required to make this work
properly).
So it doesn't even work as advertised. I agree that it would be nice
to do resolution changes on LVDS (and also eDP) whithout blacking the
screen where the panel fitter allows to do that. But imo we should
implement this as a special case a few layers up in the mode set code,
akin to how we already detect simple framebuffer changes (and only
update the required registers with ->mode_set_base).
Until this is all in place, make our lives easier and just rip it out.
Also note that this seems to fix actual bugs with enabling the lvds
output, see:
http://lists.freedesktop.org/archives/intel-gfx/2012-July/018614.html
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Giacomo Comes <comes@naic.edu>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>