Latest comments
Car paint + physical sky + architectural material
[img]http://pic002.cnblogs.com/images/2012/6204/2012020123221751.jpg[/img]
[img]http://pic002.cnblogs.com/images/2012/6204/2012020113343218.jpg[/img]
Car paint material test.
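Re: Derivation of the ray-triangle intersection algorithm in [Maxim07] Hellraider 2011-12-18 10:52
There were still papers on triangle intersection in 2007... some people really do keep working at it tirelessly.
Re: Finally came up with a fairly plain way to tell R&B from Hip-hop Librazy@沙县五中 2011-11-28 18:18
By now, it is all just pop music.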
http://nguyendangbinh.org/Proceedings/Eurographics/2005/dl/conf/award/eurographics_86pp145-152.pdf
http://www.mentallandscape.com/Papers_siggraph90tutorial.pdf
[img]http://pic002.cnblogs.com/images/2011/6204/2011111413085089.jpg[/img]
GI with dragon.
[img]http://pic002.cnblogs.com/images/2011/6204/2011111210123191.jpg[/img]
hair rendering test with GI.
QMC random walk
http://www.ima.umn.edu/preprints/MARCH1993/1110.pdf
http://www.spot3d.com/vray/help/150SP1/render_params_dmc.htm
p(t)=(p0-2p1+p2)t^2+2(p1-p0)t+p0
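The formula above reads as the expanded (power-basis) form of a quadratic Bezier curve with control points p0, p1, p2. A minimal sketch, assuming a hypothetical `Vec2` type and helper names that are not part of the original note, that evaluates it with Horner's rule and compares it against the Bernstein form:

```cpp
#include <cmath>

// Hypothetical 2D point type, used only for this illustration.
struct Vec2 { double x, y; };

inline Vec2 add(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
inline Vec2 sub(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
inline Vec2 mul(Vec2 a, double s) { return {a.x * s, a.y * s}; }

// Power-basis form from the note:
//   p(t) = (p0 - 2 p1 + p2) t^2 + 2 (p1 - p0) t + p0
// evaluated as ((a t + b) t + p0).
Vec2 bezier_power(Vec2 p0, Vec2 p1, Vec2 p2, double t) {
    Vec2 a = add(sub(p0, mul(p1, 2.0)), p2);  // p0 - 2 p1 + p2
    Vec2 b = mul(sub(p1, p0), 2.0);           // 2 (p1 - p0)
    return add(mul(add(mul(a, t), b), t), p0);
}

// Bernstein form: (1 - t)^2 p0 + 2 t (1 - t) p1 + t^2 p2.
Vec2 bezier_bernstein(Vec2 p0, Vec2 p1, Vec2 p2, double t) {
    double u = 1.0 - t;
    return add(add(mul(p0, u * u), mul(p1, 2.0 * t * u)), mul(p2, t * t));
}

// Largest per-component difference between two points.
double max_abs_diff(Vec2 a, Vec2 b) {
    return std::fmax(std::fabs(a.x - b.x), std::fabs(a.y - b.y));
}
```

Both functions should return the same point for any t; the power-basis version needs fewer multiplications per evaluation once a and b are precomputed.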
Re:Network Programming Len3d 2011-04-11 15:05
Is using UDP multicast to broadcast messages far more efficient than TCP/IP?
@TG8964
Because C is considered to be more portable and efficient than C++.
Why reinvent the wheel? Is it really any different from using a subset of C++ as plain C plus STL container support?
to make bsp faster:
1. smaller average size
2. smaller average depth
3. smaller max size
4. more empty leaves
RW-lock:
non final gather + single thread: 9.66 seconds
final gather + single thread: 53.73 seconds
non final gather + 8 threads: 2.45 seconds
final gather + 8 threads: 8.22 seconds
TLS-mutex:
non final gather + single thread: 9.87 seconds
final gather + single thread: 51.59 seconds
non final gather + 8 threads: 2.04 seconds
final gather + 8 threads: 8.03 seconds
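Re: Starting to write a 3D indoor game engine 3D游戏引擎网 2010-09-20 12:56
I learned a great deal from your posts. If you could share your game engine material on a bigger platform, then
[url=http://www.gamengines.com]3D游戏引擎网[/url] [url=www.gamengines.com ]www.gamengines.com [/url] sincerely invites you to join. Thank you.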
1. large-BSP mode can really handle larger scenes in elvish ray.
2. we can use in-memory recomputation rather than disk caching for regular-bsp to handle larger scenes.
3. we should use memory much more tightly in large-bsp mode.
glossy reflection is enabled, at least it does not crash...
Start rendering...
Running process...
Approximating time: 0 hours 0 minutes 0.004000 seconds.
Running process...
BSP construction time: 0 hours 0 minutes 0.728998 seconds.
Final gathering time: 0 hours 0 minutes 0.000000 seconds.
Running process...
Cleaning process...
Trace time: 0 hours 8 minutes 17.776001 seconds.
Outputing images...
Average BSP Size: 1.549286
Average BSP Depth: 19.152911
Max BSP Size: 7
Max BSP Depth: 27
Num BSP Leaves: 53626
Num BSP Empty Leaves: 6690
Num BSP Nodes: 53625
Num BSP Allocations: 11
Num BSP Extra Bounds: 0
Num BSP Invalid Splits: 0
Num BSP Bad Splits: 26914
BSP Memory Used: 0 MB
GPIT Memory Used: 0 MB
Num GPIT Allocations: 0
Num File Rebuilds: 0
Num Cannot Allocate Bounds: 0
Num Objects: 1
Num Object Instances: 4800
Num Source Primitives: 4761600
Num Tessellated Primitives: 0
Num Temp Allocations: 0
Num Temp Huge Allocations: 0
Num REYES Rays: 0
Num Eye Rays: 2058890
Num Shadow Rays: 4434651
Num Secondary Rays: 7556588
Num Finalgather Rays: 0
Num Probe Rays: 0
Num Photon Rays: 0
Num Asserts: 0
Num Caustic Photons: 0
Num Globillum Photons: 0
Num Stored Caustic Photons: 0
Num Stored Globillum Photons: 0
Num Shot Caustic Photons: 0
Num Shot Globillum Photons: 0
Total Subtree Size: 831.737305 MB
Average Subtree Size: 20.131873 KB
Max Subtree Size: 401.703125 KB
Num Heap Allocations: 0
Database Cache Hit Rate: 0.000000 %
Database Page File Size: 839.638969 MB
Database Compression Rate: 0.000000 %
Database Memory Peak: 512.347176 MB
Database Virtual Memory Peak: 841.355034 MB
Database Data Reads: 0
Database Data Writes: 0
Cache [0] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Cache [1] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Cache [2] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Cache [3] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Cache [4] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Cache [5] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Cache [6] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Cache [7] Memory Limit: 64 MB
-- Memory Size: 0 MB
-- Hit Rate: 0.000000 %
-- Purge Fails(Should never happen): 0
-- Alloc Fails(Should never happen): 0
-- Num Slots: 0
-- Rebuild Time: 0 : 0 : 0.000000
Global BSP average size: 5.417524
Global BSP average depth: 27.296506
Global BSP max size: 105
Global BSP max depth: 18
Global BSP leaves: 2807081
Global BSP empty leaves: 1194934
Global BSP nodes: 2764775
Global BSP bad splits: 204486
Completed rendering.
Elapsed time : 0 hours 8 minutes 18.580017 seconds.
the worst hotspots of BSP construction are still:
RC_BSP.h, Ln 663, e_BSPBuilder::classify uses 0 : 0 : 7.876179, count: 62065124, max dur.: 0.001007
RC_BSP.h, Ln 510, e_BSPBuilder::FindPlane, sort uses 0 : 0 : 6.019238, count: 1977713, max dur.: 0.001007
network related stuff:
While an item is pinned, the DB module will not destroy or move the item to make space for other
items coming in from the network. If the pointer is used to change the item (which is
usually the case after creating it), the item must be flushed explicitly after the change
is complete. Flushing means that all other hosts on the network who have a copy of the
item are notified to delete the item from their local memory, or re-read it if it was
pinned locally.
Returns the type that was specified when the item was created. This is done when an
item needs to be byte-swapped; every type is swapped in a different way.
After finishing with the reallocated item, the caller must flush the
item to inform other hosts of the change.
The item is deleted on the local
host and on all other hosts. The caller should make sure that no other host still has
this item pinned;
Remove the item from local memory if it is a cached copy that is owned by another host.
In this case, the next ei_db_access call is forced to re-read it over the network.
This is intended to reduce flushing overhead: if an entire subtree has changed, the
module that did the change could notify all hosts that the subtree has changed, rather
than doing a ei_db_flush for every item in the subtree. The notified hosts would delete
all their local copies of items in the subtree without further network traffic.
Deleting locally implies unpinning.
Notify all other hosts that have a copy of the item to invalidate or re-read their
copy. Flushing is necessary after the caller finished writing to an entry. If the
item was just created using ei_db_create, it is not necessary to flush it because no
other host can possibly have a copy, because the tag is only known to the creating
routine. If tag is 0, no item is flushed, but all deferred flushes (if any) are
propagated to other CPUs. Note that flushes of items created on another host are less
efficient than flushes of items created locally.
This call is equivalent to ei_db_flush, except that the tag may be buffered locally
and sent as a block if either the buffer fills, or a regular ei_db_flush is called.
If the last pin is removed, the DB module is free to re-use the memory,
provided that it is not the owner of the item (i.e., this is not the host where the
item was created). This last restriction ensures that at least one copy of the item
remains on the net of DBs.
Extract the VPU from a tag. The returned VPU identifies the host where the tag was
created initially, and where it still resides in local database memory. This call
can be used to distribute tasks preferentially to hosts that already have the
database items (e.g., boxes) required by the task.
The Arnold renderer can render 245 billion polygons; I don't know how they did it.
Re:atomic flushing data eygneph 2010-05-04 22:47
@Len3d
A road of no return, truly a road of no return~
Re:atomic flushing data Len3d 2010-05-04 14:57
@eygneph
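In theory it can be done, but it needs a special CAS operation whose operand is twice the width of a pointer on the current platform: a 64-bit CAS on a 32-bit platform, a 128-bit CAS on a 64-bit platform. So for now I am sticking with a non-lock-free approach...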
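A double-width CAS is often wanted so that a pointer and a version counter can be swapped together. One commonly used workaround, shown here only as a sketch and not as what the author did, is to refer to items by a 32-bit index instead of a pointer, so that index and counter fit into one ordinary 64-bit word. The names `pack`, `index_of`, `version_of` and `replace_index` are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Pack a 32-bit slot index (low half) and a 32-bit version counter (high half)
// into one 64-bit word, so a plain 64-bit CAS updates both at once.
inline std::uint64_t pack(std::uint32_t index, std::uint32_t version) {
    return (static_cast<std::uint64_t>(version) << 32) | index;
}

inline std::uint32_t index_of(std::uint64_t word) {
    return static_cast<std::uint32_t>(word);
}

inline std::uint32_t version_of(std::uint64_t word) {
    return static_cast<std::uint32_t>(word >> 32);
}

// Replace the index stored in `slot` with `new_index`, but only if the slot
// still holds exactly `expected` (same index AND same version). The version is
// bumped on success, so a stale snapshot fails even if the index value recurs.
bool replace_index(std::atomic<std::uint64_t>& slot,
                   std::uint64_t expected,
                   std::uint32_t new_index) {
    std::uint64_t desired = pack(new_index, version_of(expected) + 1);
    return slot.compare_exchange_strong(expected, desired);
}
```

A caller would load the slot, decide on a new index, and retry `replace_index` with a fresh snapshot whenever it returns false.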
Re:atomic flushing data eygneph 2010-05-04 00:29
Getting into lock-free now?
1. the problem with e_Array is that each array always keeps a small chunk of data open, so that chunk can never be flushed. When we have many such arrays, the usable memory shrinks and we cannot handle very large scenes. To lift this limitation while still keeping fast access to sequential data items, we should provide a mode like this:
array->begin_access();
for index
array->access(index);
array->end_access();
meanwhile, we also provide a random-access function:
array->get(index);
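A minimal sketch of the two access modes described above. `ChunkedArray` is a hypothetical stand-in for e_Array: the data sits in a plain `std::vector` and "pinning" is only simulated with a flag and a counter, so this shows the intended calling pattern rather than the real database-backed implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for e_Array; a real version would keep chunks in the database.
class ChunkedArray {
public:
    ChunkedArray(std::size_t count, std::size_t chunk_size)
        : data_(count, 0), chunk_size_(chunk_size),
          in_access_(false), has_pin_(false), pinned_chunk_(0), pin_switches_(0) {}

    // Sequential mode: a chunk is held open only between begin_access() and end_access().
    void begin_access() {
        assert(!in_access_);
        in_access_ = true;
    }

    int& access(std::size_t index) {
        assert(in_access_ && index < data_.size());
        std::size_t chunk = index / chunk_size_;
        if (!has_pin_ || chunk != pinned_chunk_) {
            // Crossing into another chunk: release the old pin, take the new one.
            has_pin_ = true;
            pinned_chunk_ = chunk;
            ++pin_switches_;
        }
        return data_[index];
    }

    void end_access() {
        assert(in_access_);
        in_access_ = false;
        has_pin_ = false;  // nothing stays open, so every chunk can be flushed
    }

    // Random access: returns a copy and leaves no chunk open afterwards.
    int get(std::size_t index) const {
        assert(index < data_.size());
        return data_[index];
    }

    bool has_open_chunk() const { return has_pin_; }
    std::size_t pin_switches() const { return pin_switches_; }

private:
    std::vector<int> data_;
    std::size_t chunk_size_;
    bool in_access_;
    bool has_pin_;
    std::size_t pinned_chunk_;
    std::size_t pin_switches_;
};
```

With 10 items and a chunk size of 4, a sequential pass between `begin_access()` and `end_access()` switches pins three times, and no chunk remains open once `end_access()` returns.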
2. a lot of the pre-processing time is spent in ZLIB. Since ZLIB is too slow for our data swapping, we should remove it from our source code.
mental ray consistently uses about half as many eye rays and shadow rays, which is why it is about twice as fast; when the numbers of eye rays and shadow rays are almost the same, the rendering time is almost the same.
We will get better performance if we can further reduce the factor access_accel_triangles.
If we can remove the * 3 above, we will get:
14609636 data reads < 18580585 data reads of RC_fastray. --Done.
Most of the data reads are consumed in intersect function:
Database Data Reads: 30720522 27614086
intersect : 3120808 * 3 = 9362424
intersect : 3381417 * 3 = 10144251
accel triangles : 8107411 * 1 = 8107411
The new RC_RAY engine performs more reads than the RC_fastray engine, but fewer writes:
RC_fastray:
Database Data Reads: 18580585
Database Data Writes: 431047
RC_RAY:
Database Data Reads: 30720527
Database Data Writes: 155576
The current throughput of RC_RAY (the newly implemented ray tracer) is 114487.89 rays per second, which is very slow compared to the previous implementation.
monitor ProducerConsumer {
    int itemCount
    condition full
    condition empty
    procedure add(item) {
        while (itemCount == BUFFER_SIZE) {
            wait(full)
        }
        putItemIntoBuffer(item)
        itemCount = itemCount + 1
        if (itemCount == 1) {
            notify(empty)
        }
    }
    procedure remove() {
        while (itemCount == 0) {
            wait(empty)
        }
        item = removeItemFromBuffer()
        itemCount = itemCount - 1
        if (itemCount == BUFFER_SIZE - 1) {
            notify(full)
        }
        return item
    }
}
procedure producer() {
    while (true) {
        item = produceItem()
        ProducerConsumer.add(item)
    }
}
procedure consumer() {
    while (true) {
        item = ProducerConsumer.remove()
        consumeItem(item)
    }
}
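The monitor above maps fairly directly onto a mutex plus two condition variables. A minimal sketch, assuming a hypothetical `BoundedBuffer` class (the name and the `size()` helper are not from the pseudocode); here `not_full_` plays the role of `full` and `not_empty_` the role of `empty`, and it notifies on every add/remove instead of only on the boundary transitions.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class BoundedBuffer {
public:
    explicit BoundedBuffer(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the buffer is full, then stores the item.
    void add(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        not_empty_.notify_one();  // wake one waiting consumer, if any
    }

    // Blocks while the buffer is empty, then returns the oldest item.
    T remove() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();   // wake one waiting producer, if any
        return item;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```

The predicate overload of `wait` re-checks the condition after every wakeup, which corresponds to the `while` loops around `wait(...)` in the pseudocode.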
Re:Network Programming Len3d 2010-03-23 09:41
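If const is to the left of *, it qualifies the variable the pointer points to, i.e. the pointee is constant;
if const is to the right of *, it qualifies the pointer itself, i.e. the pointer itself is constant.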
Re:Network Programming Len3d 2010-03-23 09:37
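(1) The pointer itself is constant and cannot be changed (the parentheses only show grouping; they are not valid C/C++ syntax)
(char*) const pContent;    // written in real code as: char* const pContent;
const (char*) pContent;    // holds only when char* is a single type name, e.g. a typedef
(2) The content the pointer points to is constant and cannot be changed
const (char) *pContent;    // written in real code as: const char *pContent;
(char) const *pContent;    // written in real code as: char const *pContent;
(3) Neither can be changed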
const char* const pContent;
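The three cases can be checked by the compiler itself. A minimal sketch using `<type_traits>`; the alias `CharPtr` and the function `const_rules_demo` are hypothetical names introduced only for this illustration.

```cpp
#include <type_traits>

typedef char* CharPtr;  // hypothetical alias, to show the typedef case

// (1) const to the right of *: the pointer is const, the pointee is not.
static_assert(std::is_const<char* const>::value, "pointer itself is const");
static_assert(!std::is_const<std::remove_pointer<char* const>::type>::value,
              "pointee is mutable");
// With a typedef, a leading const applies to the whole pointer type.
static_assert(std::is_same<const CharPtr, char* const>::value,
              "const CharPtr is char* const");

// (2) const to the left of *: the pointee is const, the pointer is not.
static_assert(!std::is_const<const char*>::value, "pointer itself is mutable");
static_assert(std::is_const<std::remove_pointer<const char*>::type>::value,
              "pointee is const");
static_assert(std::is_same<const char*, char const*>::value,
              "both spellings name the same type");

// (3) both const.
static_assert(std::is_const<const char* const>::value, "pointer is const");
static_assert(std::is_const<std::remove_pointer<const char* const>::type>::value,
              "pointee is const");

// Runtime view of what each declaration still allows.
bool const_rules_demo() {
    char a[] = "ab";
    char b[] = "cd";
    const char* p = a;  // pointee const, pointer mutable
    p = b;              // OK: retarget the pointer (*p = 'x' would not compile)
    char* const q = a;  // pointer const, pointee mutable
    *q = 'x';           // OK: write through the pointer (q = b would not compile)
    return *p == 'c' && a[0] == 'x';
}
```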
..\..\core\RC_fastray.cpp, Ln 1466, e_FRayEngine::FindPlane uses 0 hours 0 minutes 3.648033 seconds, count: 457232.
..\..\core\RC_fastray.cpp, Ln 1803, e_FRayEngine::classify uses 0 hours 0 minutes 13.344879 seconds, count: 27703813.
..\..\core\RC_fastray.cpp, Ln 1607, e_FRayEngine::create_leaf uses 0 hours 0 minutes 5.054024 seconds, count: 433089.
..\..\core\RC_fastray.cpp, Ln 1331, e_FRayEngine::FindPlaneFast uses 0 hours 0 minutes 2.861984 seconds, count: 143.
..\..\core\RC_fastray.cpp, Ln 2259, e_FRayEngine::build_tree, merge events uses 0 hours 0 minutes 8.690048 seconds, count: 432945.
In the plane, the Delaunay triangulation maximizes the minimum angle. Compared to any other triangulation of the points, the smallest angle in the Delaunay triangulation is at least as large as the smallest angle in any other. However, the Delaunay triangulation does not necessarily minimize the maximum angle.
Re: Starting to learn GLSL Len3d 2010-02-26 17:08
There are also many built-in functions which can (and should) be used:
dot a simple dot product
cross a simple cross product
texture2D used for sampling a texture
normalize normalize a vector
clamp clamping a vector to a minimum and a maximum
For a full list of built-in functions see reference [2], page 46.
Each shader must define a void main() function. This function is called when the shader is executed.
Re: Starting to learn GLSL Len3d 2010-02-26 17:07
Vector multiplication is component-wise.
Vector with matrix multiplication is also available.
Matrix * Vector will treat the vector as a column-vector (OpenGL standard)
Vector * Matrix will treat the vector as a row-vector (DirectX standard)
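The rules above can be spelled out on the CPU side. A minimal C++ sketch (not GLSL), assuming hypothetical `Vec3`/`Mat3` aliases and a row-major `m[row][col]` layout chosen only for this illustration; it shows that Vector * Matrix gives the same result as multiplying the transposed matrix by the column vector.

```cpp
#include <array>
#include <cstddef>

typedef std::array<double, 3> Vec3;
typedef std::array<Vec3, 3> Mat3;  // indexed as m[row][col] in this sketch

// Component-wise vector product: r[i] = a[i] * b[i].
Vec3 mul_componentwise(const Vec3& a, const Vec3& b) {
    Vec3 r = {{0.0, 0.0, 0.0}};
    for (std::size_t i = 0; i < 3; ++i) r[i] = a[i] * b[i];
    return r;
}

// Matrix * Vector: v is a column vector, r[i] = sum_j m[i][j] * v[j].
Vec3 mul_mat_vec(const Mat3& m, const Vec3& v) {
    Vec3 r = {{0.0, 0.0, 0.0}};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j) r[i] += m[i][j] * v[j];
    return r;
}

// Vector * Matrix: v is a row vector, r[j] = sum_i v[i] * m[i][j].
Vec3 mul_vec_mat(const Vec3& v, const Mat3& m) {
    Vec3 r = {{0.0, 0.0, 0.0}};
    for (std::size_t j = 0; j < 3; ++j)
        for (std::size_t i = 0; i < 3; ++i) r[j] += v[i] * m[i][j];
    return r;
}

Mat3 transpose(const Mat3& m) {
    Mat3 t = m;
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j) t[i][j] = m[j][i];
    return t;
}
```

For v = (1, 0, 0), `mul_mat_vec` picks out the first column of the matrix and `mul_vec_mat` picks out its first row.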
Re: Starting to learn GLSL Len3d 2010-02-26 17:06
GLSL is 100% type safe. You are not allowed to assign an integer to a float without casting (by constructor).
Re: Starting to learn GLSL Len3d 2010-02-26 17:04
You are also able to specify your own attributes, uniforms and varyings. For example, if you want to pass a 3D tangent vector for each vertex from your application to the vertex shader, you can specify a “Tangent” attribute:
attribute vec3 Tangent;
Here are some other examples:
uniform sampler2D my_color_texture;
uniform mat4 my_texture_matrix;
varying vec3 vertex_to_light_vector;
varying vec3 vertex_to_eye_vector;
attribute vec3 tangent;
attribute vec3 binormal;
Re: Starting to learn GLSL Len3d 2010-02-26 17:01
GLSL has some built-in attributes in a vertex shader:
gl_Vertex 4D vector representing the vertex position
gl_Normal 3D vector representing the vertex normal
gl_Color 4D vector representing the vertex color
gl_MultiTexCoordX 4D vector representing the texture coordinate of texture unit X
There are some other built-in attributes, see reference [2], page 41 for a full list.
GLSL also has some built-in uniforms:
gl_ModelViewMatrix 4x4 Matrix representing the model-view matrix.
gl_ModelViewProjectionMatrix 4x4 Matrix representing the model-view-projection matrix.
gl_NormalMatrix 3x3 Matrix representing the inverse transpose model-view matrix.
This matrix is used for normal transformation.
There are some other built-in uniforms, like lighting states. See reference [2], page 42 for a full list.
GLSL Built-In Varyings:
gl_FrontColor 4D vector representing the primitive's front color
gl_BackColor 4D vector representing the primitive's back color
gl_TexCoord[X] 4D vector representing the Xth texture coordinate
There are some other built-in varyings. See reference [2], page 44 for a full list.
And last but not least there are some built-in types which are used for shader output:
gl_Position 4D vector representing the final processed vertex position. Only
available in vertex shader.
gl_FragColor 4D vector representing the final color which is written in the frame
buffer. Only available in fragment shader.
gl_FragDepth float representing the depth which is written in the depth buffer.
Only available in fragment shader.
Built-in types matter because they are mapped to the OpenGL states. For example, if you call glLightfv(GL_LIGHT0, GL_POSITION, my_light_position), this value is available as a uniform through gl_LightSource[0].position in a vertex and/or fragment shader.
1. most of the bsp building time is wasted in memory allocations of small objects.
2. half of the rendering time is wasted in read-lock/read-unlock.
the above 2 problems can "hopefully" be resolved by our new ray-tracing engine, since it:
1. allocates memory at a coarser granularity (which means fewer allocations).
2. locks/unlocks less often.
The algorithm should be:
1. build initial bsp
that is, a kd-tree of tessel instances.
2. now we have a kd-tree whose leaves each contain a list of tessel instances, and the depth of each leaf is known in advance.
3. in multi-threaded rendering process, we build sub kd-trees dynamically, so the construction is automatically multi-threaded.
3.1. for regular bsp
we collect the indices of triangles from the list of tessel instances under each leaf, cull triangles which are not in this leaf, then build sub kd-tree for this leaf based on the uniform list of indices.
3.2. for large bsp
we build local kd-trees for all referenced tessels, they are not combined into a uniform one.
3.3. for bsp2
we ignore the bsp depth/bsp size parameters and determine these values on the fly.
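Step 3 amounts to building each leaf's sub kd-tree lazily, the first time a rendering thread reaches that leaf. A minimal sketch of that idea; `Leaf`, `SubTree` and `get_subtree` are hypothetical names, the "build" is a placeholder that only records the triangle count, and the lock on every access is a simplification that a real engine would likely replace with a cheaper fast path.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical placeholder for a per-leaf sub kd-tree.
struct SubTree {
    std::size_t num_triangles;
};

struct Leaf {
    std::vector<int> triangle_indices;  // gathered from the leaf's tessel instances
    std::mutex build_mutex;
    std::unique_ptr<SubTree> subtree;   // empty until the first thread needs it
    int build_count = 0;                // how many times the build actually ran

    // Called by any rendering thread that traverses into this leaf.
    const SubTree& get_subtree() {
        std::lock_guard<std::mutex> lock(build_mutex);
        if (!subtree) {
            // Placeholder build: a real engine would cull triangles outside the
            // leaf and construct the sub kd-tree here.
            subtree.reset(new SubTree{triangle_indices.size()});
            ++build_count;
        }
        return *subtree;
    }
};
```

Because different threads hit different leaves, the builds spread across threads without a separate parallel construction pass, and leaves that no ray ever reaches are never built.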
about slow blank scanning:
1. job execution itself takes 1 sec; --Because time_wait for empty job queue.
2. building filter table takes 1 sec(it really has a very bad data structure); --Done.
3. writing/reading pixels into/from frame buffers (disk I/O) takes 1 sec. --Done.
1. multi-threading is NOT efficient. -- Resolved by spinlock
2. too slow in rendering large database.
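For reference, a spinlock of the kind mentioned in item 1 can be sketched with `std::atomic_flag`. This is a generic illustration under that assumption, not the renderer's actual lock; `SpinLock` is a hypothetical name.

```cpp
#include <atomic>

// Minimal test-and-set spinlock; usable with std::lock_guard<SpinLock>.
class SpinLock {
public:
    void lock() {
        // Spin until the flag was previously clear, i.e. until we acquired it.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }

    // Returns true if the lock was acquired without waiting.
    bool try_lock() {
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    void unlock() {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

A spinlock tends to pay off only when critical sections are very short, since waiting threads burn CPU instead of sleeping.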
openmesh or TRIANGLE for geometry approximation?
A Rapid Hierarchical Rendering Technique for Translucent Materials