robopal.envs.manipulation_tasks.robot_manipulate module
- class robopal.envs.manipulation_tasks.robot_manipulate.ManipulateEnv(robot=None, render_mode='human', control_freq=20, controller='CARTIK', is_interpolate=False, action_type='velocity', is_randomize_end=True, is_randomize_object=True, is_show_camera_in_cv=False, is_render_camera_offscreen=False, camera_in_render='frontview', camera_in_window='free')[source]
Bases: RobotEnv
The control frequency of the robot is f = 20 Hz. This is achieved by applying the same action for 50 consecutive simulator steps (with a time step of dt = 0.0005 s) before returning control to the robot.
- compute_rewards(achieved_goal: ndarray = array([0., 0., 0.]), desired_goal: ndarray = array([0., 0., 0.]), info: dict = None, **kwargs)[source]
Sparse reward: the returned reward takes one of two values, -1 if the block has not reached its final target position and 0 if it has. The block is considered to have reached the goal when the Euclidean distance between the achieved and desired positions is less than 0.05 m.
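The thresholding rule described above can be sketched in a few lines. This is a standalone illustration, not robopal's actual implementation: the function name `sparse_reward` and the constant `DISTANCE_THRESHOLD` are hypothetical, and plain Python sequences are used in place of ndarrays to keep the example dependency-free.

```python
import math

# Goal tolerance in metres, taken from the description above.
DISTANCE_THRESHOLD = 0.05


def sparse_reward(achieved_goal, desired_goal, threshold=DISTANCE_THRESHOLD):
    """Return 0.0 if the goal is reached, otherwise -1.0 (hypothetical helper)."""
    # Euclidean distance between the achieved and desired positions.
    distance = math.dist(achieved_goal, desired_goal)
    return 0.0 if distance < threshold else -1.0


# Block 0.03 m from the target: inside the tolerance.
near = sparse_reward([0.50, 0.00, 0.42], [0.50, 0.03, 0.42])
# Block 0.10 m from the target: outside the tolerance.
far = sparse_reward([0.50, 0.00, 0.42], [0.60, 0.00, 0.42])
```

Here `near` evaluates to 0.0 and `far` to -1.0. With ndarray inputs, as in the documented signature, the distance would more typically be computed with `numpy.linalg.norm(achieved_goal - desired_goal)`.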
- reset(seed=None, options=None)[source]
Reset the simulated environment so that the next episode can be executed.
- set_random_init_position()[source]
Set the initial position of the end effector to a random position within the workspace.
- step(action) → Tuple[source]
Take one step in the environment.
- Parameters:
action -- The action space is 4-dimensional: the first 3 dimensions are the desired linear velocities of the end effector in Cartesian coordinates, and the last dimension is the desired gripper state (0 denotes closed, 1 denotes open).
- Returns:
obs, reward, terminated, truncated, info
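A minimal sketch of how the action layout and the five-element return value of step() fit together in a rollout loop. Everything here apart from the reset/step method names and the return order is an assumption: `make_action`, `rollout`, and `StubEnv` are hypothetical names, and `StubEnv` is only a stand-in that mimics the documented return shape so the example runs without robopal or a simulator.

```python
def make_action(vx, vy, vz, gripper_open):
    """Pack end-effector velocities and a gripper command into a 4-element action."""
    # Last element follows the documented convention: 0 = closed, 1 = open.
    return [vx, vy, vz, 1.0 if gripper_open else 0.0]


class StubEnv:
    """Hypothetical stand-in for ManipulateEnv; it only mimics the step() return shape."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0

    def step(self, action):
        assert len(action) == 4, "action must be 4-dimensional"
        self.t += 1
        obs = list(action[:3])              # placeholder observation
        reward = -1.0                       # sparse reward: goal never reached here
        terminated = False                  # the stub has no success condition
        truncated = self.t >= self.horizon  # stop after a fixed number of steps
        info = {}
        return obs, reward, terminated, truncated, info


def rollout(env, max_steps=100):
    """Run one episode with a constant action and return (total_reward, steps)."""
    env.reset()
    total, steps = 0.0, 0
    for _ in range(max_steps):
        # Move along +x with the gripper held open.
        action = make_action(0.1, 0.0, 0.0, gripper_open=True)
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        steps += 1
        if terminated or truncated:
            break
    return total, steps


total_reward, n_steps = rollout(StubEnv(horizon=5))
```

With a real environment, `StubEnv(horizon=5)` would be replaced by a constructed ManipulateEnv instance; the loop itself relies only on the documented step() return order.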