Commit Graph

2933 Commits

Author SHA1 Message Date
Michael Niedermayer 9ac124c889 Merge commit '206895708ea2b464755d340e44501daf9a07c310'
* commit '206895708ea2b464755d340e44501daf9a07c310':
  x86inc: Remove our FMA4 support

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 15:54:23 +02:00
Michael Niedermayer 12e4493f9c Merge commit 'c108ba0175d4fc3a3253a8b0f782fbfb96ba5098'
* commit 'c108ba0175d4fc3a3253a8b0f782fbfb96ba5098':
  x86inc: Use VEX-encoded instructions in AVX functions

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-14 15:48:34 +02:00
Jason Garrett-Glaser a3fabc6cb3 x86: more AVX2 framework
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:41:56 +01:00
Jason Garrett-Glaser c6908d6b4b x86inc: FMA3/4 Support
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:41:54 +01:00
Derek Buitenhuis 206895708e x86inc: Remove our FMA4 support
This is so we can sync to x264's version of FMA4 support.

This partialy reverts commit 79687079a9.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:39:29 +01:00
Henrik Gramner c108ba0175 x86inc: Use VEX-encoded instructions in AVX functions
Automatically use VEX-encoding in AVX/AVX2/XOP/FMA3/FMA4
functions for all instructions that exists in a VEX-encoded
version.

This change makes it easier to extend existing code to use AVX2.

Also add support for AVX emulation of a few instructions that
were missing before.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-14 12:36:11 +01:00
Michael Niedermayer 31d0d35560 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  x86inc: Remove .rodata kludges

Conflicts:
	libavutil/x86/x86inc.asm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-09 14:29:42 +02:00
Henrik Gramner ad7d7d4f6a x86inc: Remove .rodata kludges
The Mach-O bug was fixed in yasm 0.8.0 and we don't
support versions that old anymore.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-09 07:44:30 -04:00
Michael Niedermayer 19c3890819 Merge commit '3e2fa991db7ef172579422accd61624d52777e5a'
* commit '3e2fa991db7ef172579422accd61624d52777e5a':
  x86inc: remove misaligned cpu flag

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 12:02:21 +02:00
Michael Niedermayer 31d9aa6b2e Merge commit '71155665414b551ad350622d5abed20e58371fbf'
* commit '71155665414b551ad350622d5abed20e58371fbf':
  x86inc: various minor backports from x264

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:57:39 +02:00
Michael Niedermayer 3f965ab95d Merge commit '47f9d7ce5493e119e09d1227d017414feaaf8d97'
* commit '47f9d7ce5493e119e09d1227d017414feaaf8d97':
  x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:37:22 +02:00
Michael Niedermayer 1f17619fe4 Merge commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450'
* commit 'bbe4a6db44f0b55b424a5cc9d3e89cd88e250450':
  x86inc: Utilize the shadow space on 64-bit Windows

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:23:00 +02:00
Michael Niedermayer 17d9c7c208 Merge commit '3fb78e99a04d0ed8db834d813d933eb86c37142a'
* commit '3fb78e99a04d0ed8db834d813d933eb86c37142a':
  x86inc: create xm# and ym#, analagous to m#

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:15:17 +02:00
Michael Niedermayer 3352fdb292 Merge commit '49ebe3f9fe02174ae7e14548001fd146ed375cc2'
* commit '49ebe3f9fe02174ae7e14548001fd146ed375cc2':
  x86inc: fix some corner cases of SWAP

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:07:03 +02:00
Michael Niedermayer 006c0fcfea Merge commit '63f0d623100bdb0c6081456127f4b6713e83d3db'
* commit '63f0d623100bdb0c6081456127f4b6713e83d3db':
  x86inc: Use SSE instead of SSE2 for copying data

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 11:01:40 +02:00
Michael Niedermayer faafffaf82 Merge commit 'ad76e6e7e193b98e7335156422d35467816f9ef1'
* commit 'ad76e6e7e193b98e7335156422d35467816f9ef1':
  x86inc: Set ELF hidden visibility for global constants

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 10:52:51 +02:00
Michael Niedermayer c1488fab3d Merge commit '25cb0c1a1e66edacc1667acf6818f524c0997f10'
* commit '25cb0c1a1e66edacc1667acf6818f524c0997f10':
  x86inc: activate REP_RET automatically

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-08 10:27:30 +02:00
Henrik Gramner 3e2fa991db x86inc: remove misaligned cpu flag
Prevents a crash if the misaligned exception mask bit is
cleared for some reason.

Misaligned SSE functions are only used on AMD Phenom CPUs
and the benefit is miniscule. They also require modifying
the MXCSR control register and by removing those functions
we can get rid of that complexity altogether.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:27:38 -04:00
Jason Garrett-Glaser 7115566541 x86inc: various minor backports from x264
Small backports that sneaked into other asm commits in x264.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:27:22 -04:00
Derek Buitenhuis 47f9d7ce54 x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"
This is also a valid value for WIN64.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:27:08 -04:00
Henrik Gramner bbe4a6db44 x86inc: Utilize the shadow space on 64-bit Windows
Store XMM6 and XMM7 in the shadow space in functions that
clobbers them. This way we don't have to adjust the stack
pointer as often, reducing the number of instructions as
well as code size.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:35 -04:00
Loren Merritt 3fb78e99a0 x86inc: create xm# and ym#, analagous to m#
For when we want to mix simd sizes within one function.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:19 -04:00
Loren Merritt 49ebe3f9fe x86inc: fix some corner cases of SWAP
SWAP with >=3 named (rather than numbered) args
PERMUTE followed by SWAP with 2 named args
used to produce the wrong permutation

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:25:06 -04:00
Henrik Gramner 63f0d62310 x86inc: Use SSE instead of SSE2 for copying data
Reduces code size because movaps/movups is one byte
shorter than movdqa/movdqu.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:24:33 -04:00
Henrik Gramner ad76e6e7e1 x86inc: Set ELF hidden visibility for global constants
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:24:13 -04:00
Loren Merritt 25cb0c1a1e x86inc: activate REP_RET automatically
Now RET checks whether it immediately follows a branch, so the
programmer dosen't have to keep track of that condition. REP_RET
is still needed manually when it's a branch target, but that's
much rarer.

The implementation involves lots of spurious labels, but that's OK
because we strip them.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
2013-10-07 06:17:59 -04:00
Niv Sardi 49ba6e56bd lavu/parseutils: add more resolutions
See http://en.wikipedia.org/wiki/Graphics_display_resolution

Signed-off-by: Niv Sardi <xaiki@evilgiggle.com>
Signed-off-by: Stefano Sabatini <stefasab@gmail.com>
2013-10-07 11:45:03 +02:00
Michael Niedermayer 90cecd3c9b Merge commit '4272bb6ef1533846a788c259cc498562d0704444'
* commit '4272bb6ef1533846a788c259cc498562d0704444':
  doxy: Document avlog

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-06 11:16:15 +02:00
Luca Barbato 4272bb6ef1 doxy: Document avlog
Provide some information for every function and add a group.
2013-10-05 18:09:45 +02:00
Stefano Sabatini 515e651f56 lavu/opt: fix doxy for av_opt_get* functions about return value
Success code must be >= 0 and not == 0, consistently with the
implementation.
2013-10-04 16:36:27 +02:00
Stefano Sabatini 719b4eef5d lavu/common: add warning to GET_UTF8 doxy
Should prevent wrong uses, or at least decrease their chance.
2013-10-04 16:36:27 +02:00
Michael Niedermayer 1e67800780 Merge commit '80fefbed623491b92fe59ead99225f99c0d0ca08'
* commit '80fefbed623491b92fe59ead99225f99c0d0ca08':
  x86: cpu: Restore some explanatory comments removed in 7160bb7

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-04 11:28:21 +02:00
Diego Biurrun 80fefbed62 x86: cpu: Restore some explanatory comments removed in 7160bb7 2013-10-03 23:00:09 +02:00
Diego Biurrun 5ce04c14dd Use correct Doxygen syntax 2013-10-03 17:53:51 +02:00
Michael Niedermayer 2ece7d94bc Merge commit '5ce04c14dd3dd3670cbdba82275a3a72c716ec6f'
* commit '5ce04c14dd3dd3670cbdba82275a3a72c716ec6f':
  Use correct Doxygen syntax

Conflicts:
	libavcodec/atrac3.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-10-03 21:29:38 +02:00
Ronald S. Bultje c07ac8d467 VP9 MC (ssse3) optimizations.
Decoding time of ped1080p.webm goes from 20.7sec to 11.3sec.
2013-10-02 21:03:15 -04:00
Michael Niedermayer 945c7e399a Merge commit '38e15df1489d86c016515223ee693e7d0326c56a'
* commit '38e15df1489d86c016515223ee693e7d0326c56a':
  avframe: note that linesize is not the usable data size

Conflicts:
	libavutil/frame.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-28 10:47:43 +02:00
Anton Khirnov 38e15df148 avframe: note that linesize is not the usable data size 2013-09-28 07:35:50 +02:00
Michael Niedermayer a454dec19a pixdesc: fix NV20* descriptors
They were inconsistent (overlapping fields and wrong sizes)

Signed-off-by: Anton Khirnov <anton@khirnov.net>
2013-09-24 14:02:10 +02:00
Michael Niedermayer 361bc70731 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  avutil: Fix compilation with inline asm disabled on mingw

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-22 11:51:38 +02:00
Michael Niedermayer 85f8a3cb50 Merge commit 'e208e6d209728d332343aa5390ae377ac0a6305c'
* commit 'e208e6d209728d332343aa5390ae377ac0a6305c':
  lavu: Add interleaved 4:2:2 8/10-bit formats

Conflicts:
	doc/APIchanges
	libavutil/pixdesc.c
	libavutil/pixfmt.h
	libavutil/version.h

See: 90ca5a9b5f
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-22 11:42:03 +02:00
Michael Niedermayer 8310bccc91 avutil/pixdesc: try to fix NV20* descriptors
They where inconsistent (overlapping fields and wrong sizes)

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-22 11:34:01 +02:00
Kieran Kunhya 90ca5a9b5f Add interleaved 4:2:2 8/10-bit formats
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-22 10:49:29 +02:00
Alex Smith 08fa828b3f avutil: Fix compilation with inline asm disabled on mingw
Because of -Werror=implicit-function-declaration the build will fail.

Signed-off-by: Martin Storsjö <martin@martin.st>
2013-09-22 00:50:32 +03:00
Kieran Kunhya e208e6d209 lavu: Add interleaved 4:2:2 8/10-bit formats
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
2013-09-21 17:36:15 +02:00
Alex Smith 66c2f200b6 lavu/attributes: Don't define av_restrict
This is always defined in config.h.

Original patch by Derek Buitenhuis.
2013-09-21 15:43:31 +02:00
Michael Niedermayer 0506f3fa38 avutil/cpu: remove duplicate include
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-21 15:36:33 +02:00
Michael Niedermayer 6c169c2fa4 Merge commit '67e285ceca1cb602a5ab87010b30d904527924fe'
* commit '67e285ceca1cb602a5ab87010b30d904527924fe':
  mem: Handle av_reallocp(..., 0) properly

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-21 09:36:58 +02:00
Martin Storsjö 67e285ceca mem: Handle av_reallocp(..., 0) properly
Previously this did a double free (and returned an error).

Reported-by: Justin Ruggles
Signed-off-by: Martin Storsjö <martin@martin.st>
2013-09-20 21:23:08 +03:00
Michael Niedermayer 8377b2b6bf Merge commit '33b88f2a4ae54d5397c45e39a5326289ebdc7747'
* commit '33b88f2a4ae54d5397c45e39a5326289ebdc7747':
  msvc/icl: Use __declspec(noinline)

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2013-09-20 16:03:00 +02:00